Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/02/18 19:47:27 UTC

[GitHub] [airflow] hsrocks opened a new pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

hsrocks opened a new pull request #21673:
URL: https://github.com/apache/airflow/pull/21673


   This PR implements a SageMaker operator and hook to delete a SageMaker model.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] hsrocks edited a comment on pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
hsrocks edited a comment on pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#issuecomment-1050419819


   > Looks like my points have been addressed other than that new question about the indenting of the START/END flag in the sample DAG, which may or may not be a concern. Thanks for the changes.
   
   Thanks a lot for the review :) No, it did not fail for me.





[GitHub] [airflow] ferruzzi commented on a change in pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#discussion_r813398178



##########
File path: docs/apache-airflow-providers-amazon/operators/sagemaker.rst
##########
@@ -0,0 +1,71 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+Amazon Sagemaker Operators

Review comment:
       ```suggestion
   Amazon SageMaker Operators
   ```
   
   Nitpick:  Here and elsewhere, `Amazon SageMaker` is the proper/official naming convention.

##########
File path: airflow/providers/amazon/aws/example_dags/example_sagemaker.py
##########
@@ -0,0 +1,179 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from datetime import datetime
+from os import environ
+
+from airflow import DAG
+from airflow.providers.amazon.aws.operators.sagemaker import (
+    SageMakerDeleteModelOperator,
+    SageMakerModelOperator,
+    SageMakerProcessingOperator,
+    SageMakerTrainingOperator,
+    SageMakerTransformOperator,
+)
+
+model_name = "sample_model"
+training_job_name = 'sample_training'
+image_uri = environ.get('ECR_IMAGE_URI', '123456789012.dkr.ecr.us-east-1.amazonaws.com/repo_name')
+s3_bucket = environ.get('BUCKET_NAME', 'test-airflow-12345')
+role = environ.get('SAGEMAKER_ROLE_ARN', 'arn:aws:iam::123456789012:role/role_name')
+
+sagemaker_processing_job_config = {
+    "ProcessingJobName": "sample_processing_job",
+    "ProcessingInputs": [
+        {
+            "InputName": "input",
+            "AppManaged": False,
+            "S3Input": {
+                "S3Uri": f"s3://{s3_bucket}/preprocessing/input/",
+                "LocalPath": "/opt/ml/processing/input/",
+                "S3DataType": "S3Prefix",
+                "S3InputMode": "File",
+                "S3DataDistributionType": "FullyReplicated",
+                "S3CompressionType": "None",
+            },
+        },
+    ],
+    "ProcessingOutputConfig": {
+        "Outputs": [
+            {
+                "OutputName": "output",
+                "S3Output": {
+                    "S3Uri": f"s3://{s3_bucket}/preprocessing/output/",
+                    "LocalPath": "/opt/ml/processing/output/",
+                    "S3UploadMode": "EndOfJob",
+                },
+                "AppManaged": False,
+            }
+        ]
+    },
+    "ProcessingResources": {
+        "ClusterConfig": {
+            "InstanceCount": 1,
+            "InstanceType": "ml.m5.large",
+            "VolumeSizeInGB": 5,
+        }
+    },
+    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
+    "AppSpecification": {
+        "ImageUri": f"{image_uri}",
+        "ContainerEntrypoint": ["python3", "./preprocessing.py"],
+    },
+    "RoleArn": f"{role}",
+}
+
+sagemaker_training_job_config = {
+    "AlgorithmSpecification": {
+        "TrainingImage": f"{image_uri}",
+        "TrainingInputMode": "File",
+    },
+    "InputDataConfig": [
+        {
+            "ChannelName": "config",
+            "DataSource": {
+                "S3DataSource": {
+                    "S3DataType": "S3Prefix",
+                    "S3Uri": f"s3://{s3_bucket}/config/",
+                    "S3DataDistributionType": "FullyReplicated",
+                }
+            },
+            "CompressionType": "None",
+            "RecordWrapperType": "None",
+        },
+    ],
+    "OutputDataConfig": {
+        "KmsKeyId": "",
+        "S3OutputPath": f"s3://{s3_bucket}/training/",
+    },
+    "ResourceConfig": {
+        "InstanceType": "ml.m5.large",
+        "InstanceCount": 1,
+        "VolumeSizeInGB": 5,
+    },
+    "StoppingCondition": {"MaxRuntimeInSeconds": 6000},
+    "RoleArn": f"{role}",
+    "EnableNetworkIsolation": False,
+    "EnableInterContainerTrafficEncryption": False,
+    "EnableManagedSpotTraining": False,
+    "TrainingJobName": training_job_name,
+}
+
+sagemaker_create_model_config = {
+    "ModelName": model_name,
+    "Containers": [
+        {
+            "Image": f"{image_uri}",
+            "Mode": "SingleModel",
+            "ModelDataUrl": f"s3://{s3_bucket}/training/{training_job_name}/output/model.tar.gz",
+        }
+    ],
+    "ExecutionRoleArn": f"{role}",
+    "EnableNetworkIsolation": False,
+}
+
+sagemaker_inference_config = {
+    "TransformJobName": "sample_transform_job",
+    "ModelName": model_name,
+    "TransformInput": {
+        "DataSource": {
+            "S3DataSource": {
+                "S3DataType": "S3Prefix",
+                "S3Uri": f"s3://{s3_bucket}/config/config_date.yml",
+            }
+        },
+        "ContentType": "application/x-yaml",
+        "CompressionType": "None",
+        "SplitType": "None",
+    },
+    "TransformOutput": {"S3OutputPath": f"s3://{s3_bucket}/inferencing/output/"},
+    "TransformResources": {"InstanceType": "ml.m5.large", "InstanceCount": 1},
+}
+
+# [START howto_operator_sagemaker]
+with DAG(
+    "sample_sagemaker_dag",
+    schedule_interval=None,
+    start_date=datetime(2022, 2, 21),
+    catchup=False,
+) as dag:
+    sagemaker_processing_task = SageMakerProcessingOperator(
+        config=sagemaker_processing_job_config,
+        aws_conn_id="aws_default",

Review comment:
       Out of scope for this PR, but the operators not having a default value for `aws_conn_id` is a pain; we should fix that.
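
The idea of a default connection id can be illustrated with a minimal, Airflow-free sketch. `SketchDeleteModelOperator` is a hypothetical stand-in, not the provider's real class, and the choice of `"aws_default"` as the default is an assumption taken from the value the sample DAG passes explicitly.

```python
class SketchDeleteModelOperator:
    """Hypothetical stand-in for an operator; not the provider's real class."""

    def __init__(self, *, task_id: str, config: dict, aws_conn_id: str = "aws_default"):
        # With a default, callers only pass aws_conn_id when they need a
        # connection other than the conventional one.
        self.task_id = task_id
        self.config = config
        self.aws_conn_id = aws_conn_id


# No aws_conn_id given: the default applies.
task = SketchDeleteModelOperator(
    task_id="sagemaker_delete_model_task",
    config={"ModelName": "sample_model"},
)

# An explicit value still overrides the default.
overridden = SketchDeleteModelOperator(
    task_id="sagemaker_delete_model_task_other",
    config={"ModelName": "sample_model"},
    aws_conn_id="sagemaker_test_id",
)
```

With such a default, each operator call in the sample DAG could omit the repeated `aws_conn_id="aws_default"` argument.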

##########
File path: tests/providers/amazon/aws/operators/test_sagemaker_model.py
##########
@@ -63,3 +66,17 @@ def test_execute_with_failure(self, mock_model, mock_client):
         mock_model.return_value = {'ModelArn': 'testarn', 'ResponseMetadata': {'HTTPStatusCode': 404}}
         with pytest.raises(AirflowException):
             self.sagemaker.execute(None)
+
+
+class TestSageMakerDeleteModelOperator(unittest.TestCase):
+    def setUp(self):
+        self.sagemaker = SageMakerDeleteModelOperator(
+            task_id='test_sagemaker_operator', aws_conn_id='sagemaker_test_id', model_name='test'
+        )
+
+    @mock.patch.object(SageMakerHook, 'get_conn')
+    @mock.patch.object(SageMakerHook, 'delete_model')
+    def test_model_delete(self, mock_model, mock_client):

Review comment:
       Nitpick, take it or leave it, but I'd rename `mock_model` to `mock_delete_model` for clarity, though I suppose that's personal preference.
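
The naming matters because stacked `mock.patch.object` decorators are applied bottom-up, so the mock for the decorator closest to the function arrives as the first argument. The sketch below shows that ordering with the suggested name; `FakeSageMakerHook` and `run_delete` are hypothetical stand-ins so the example runs without Airflow or AWS.

```python
from unittest import mock


class FakeSageMakerHook:
    """Hypothetical stand-in for the real hook, used only for this sketch."""

    def get_conn(self):
        return "real connection"

    def delete_model(self, model_name):
        return "real delete"


@mock.patch.object(FakeSageMakerHook, "get_conn")
@mock.patch.object(FakeSageMakerHook, "delete_model")
def run_delete(mock_delete_model, mock_client):
    # The bottom decorator (delete_model) is applied first, so its mock is
    # the first positional argument; get_conn's mock comes second.
    hook = FakeSageMakerHook()
    hook.delete_model(model_name="test")
    mock_delete_model.assert_called_once_with(model_name="test")
    mock_client.assert_not_called()
    return mock_delete_model, mock_client


delete_mock, client_mock = run_delete()
```

Naming the first argument after the patched method makes it harder to swap the two mocks by accident.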

##########
File path: airflow/providers/amazon/aws/hooks/sagemaker.py
##########
@@ -908,3 +908,19 @@ def find_processing_job_by_name(self, processing_job_name: str) -> bool:
             if e.response['Error']['Code'] in ['ValidationException', 'ResourceNotFound']:
                 return False
             raise
+
+    def delete_model(self, model_name: str):
+        """Delete Sagemaker model
+        :param model_name: (optional) name of the model
+        :return: True if Model exists and deleted else return False

Review comment:
       In general I try to keep hook behavior as close to the underlying boto functionality as possible.  Is there a particular reason you want to return True or False here when the boto call itself does not return a value?
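
One way to read this suggestion is a hook method that simply forwards to the client, returns nothing, and lets client errors propagate to the caller. The following is a minimal sketch under stated assumptions: `FakeSageMakerClient` and `SketchHook` are hypothetical stand-ins, and the `ValueError` stands in for whatever error the real client raises when the model is missing.

```python
class FakeSageMakerClient:
    """Hypothetical stand-in for the boto3 SageMaker client."""

    def __init__(self):
        self.models = {"sample_model"}

    def delete_model(self, ModelName):
        # Stand-in error; the real client raises its own exception type.
        if ModelName not in self.models:
            raise ValueError(f"Could not find model {ModelName!r}")
        self.models.remove(ModelName)


class SketchHook:
    """Hypothetical hook that mirrors the client call without adding a return value."""

    def __init__(self, client):
        self._client = client

    def get_conn(self):
        return self._client

    def delete_model(self, model_name: str) -> None:
        # Forward the call unchanged: no True/False translation, and any
        # client error reaches the caller as-is.
        self.get_conn().delete_model(ModelName=model_name)


client = FakeSageMakerClient()
hook = SketchHook(client)
result = hook.delete_model("sample_model")
```

Under this design the operator, not the hook, decides how a missing model should be handled.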

##########
File path: airflow/providers/amazon/aws/example_dags/example_sagemaker.py
##########
@@ -0,0 +1,179 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from datetime import datetime
+from os import environ
+
+from airflow import DAG
+from airflow.providers.amazon.aws.operators.sagemaker import (
+    SageMakerDeleteModelOperator,
+    SageMakerModelOperator,
+    SageMakerProcessingOperator,
+    SageMakerTrainingOperator,
+    SageMakerTransformOperator,
+)
+
+model_name = "sample_model"
+training_job_name = 'sample_training'

Review comment:
       Here and elsewhere:  Single quotes or double quotes, please pick one.  Also, all of these variables defined before the DAG context are effectively globals/constants and names should be in all caps, such as `MODEL_NAME`.







[GitHub] [airflow] hsrocks commented on a change in pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
hsrocks commented on a change in pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#discussion_r814402742



##########
File path: airflow/providers/amazon/aws/example_dags/example_sagemaker.py
##########
@@ -0,0 +1,179 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from datetime import datetime
+from os import environ
+
+from airflow import DAG
+from airflow.providers.amazon.aws.operators.sagemaker import (
+    SageMakerDeleteModelOperator,
+    SageMakerModelOperator,
+    SageMakerProcessingOperator,
+    SageMakerTrainingOperator,
+    SageMakerTransformOperator,
+)
+
+model_name = "sample_model"
+training_job_name = 'sample_training'
+image_uri = environ.get('ECR_IMAGE_URI', '123456789012.dkr.ecr.us-east-1.amazonaws.com/repo_name')
+s3_bucket = environ.get('BUCKET_NAME', 'test-airflow-12345')
+role = environ.get('SAGEMAKER_ROLE_ARN', 'arn:aws:iam::123456789012:role/role_name')
+
+sagemaker_processing_job_config = {
+    "ProcessingJobName": "sample_processing_job",
+    "ProcessingInputs": [
+        {
+            "InputName": "input",
+            "AppManaged": False,
+            "S3Input": {
+                "S3Uri": f"s3://{s3_bucket}/preprocessing/input/",
+                "LocalPath": "/opt/ml/processing/input/",
+                "S3DataType": "S3Prefix",
+                "S3InputMode": "File",
+                "S3DataDistributionType": "FullyReplicated",
+                "S3CompressionType": "None",
+            },
+        },
+    ],
+    "ProcessingOutputConfig": {
+        "Outputs": [
+            {
+                "OutputName": "output",
+                "S3Output": {
+                    "S3Uri": f"s3://{s3_bucket}/preprocessing/output/",
+                    "LocalPath": "/opt/ml/processing/output/",
+                    "S3UploadMode": "EndOfJob",
+                },
+                "AppManaged": False,
+            }
+        ]
+    },
+    "ProcessingResources": {
+        "ClusterConfig": {
+            "InstanceCount": 1,
+            "InstanceType": "ml.m5.large",
+            "VolumeSizeInGB": 5,
+        }
+    },
+    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
+    "AppSpecification": {
+        "ImageUri": f"{image_uri}",
+        "ContainerEntrypoint": ["python3", "./preprocessing.py"],
+    },
+    "RoleArn": f"{role}",
+}
+
+sagemaker_training_job_config = {
+    "AlgorithmSpecification": {
+        "TrainingImage": f"{image_uri}",
+        "TrainingInputMode": "File",
+    },
+    "InputDataConfig": [
+        {
+            "ChannelName": "config",
+            "DataSource": {
+                "S3DataSource": {
+                    "S3DataType": "S3Prefix",
+                    "S3Uri": f"s3://{s3_bucket}/config/",
+                    "S3DataDistributionType": "FullyReplicated",
+                }
+            },
+            "CompressionType": "None",
+            "RecordWrapperType": "None",
+        },
+    ],
+    "OutputDataConfig": {
+        "KmsKeyId": "",
+        "S3OutputPath": f"s3://{s3_bucket}/training/",
+    },
+    "ResourceConfig": {
+        "InstanceType": "ml.m5.large",
+        "InstanceCount": 1,
+        "VolumeSizeInGB": 5,
+    },
+    "StoppingCondition": {"MaxRuntimeInSeconds": 6000},
+    "RoleArn": f"{role}",
+    "EnableNetworkIsolation": False,
+    "EnableInterContainerTrafficEncryption": False,
+    "EnableManagedSpotTraining": False,
+    "TrainingJobName": training_job_name,
+}
+
+sagemaker_create_model_config = {
+    "ModelName": model_name,
+    "Containers": [
+        {
+            "Image": f"{image_uri}",
+            "Mode": "SingleModel",
+            "ModelDataUrl": f"s3://{s3_bucket}/training/{training_job_name}/output/model.tar.gz",
+        }
+    ],
+    "ExecutionRoleArn": f"{role}",
+    "EnableNetworkIsolation": False,
+}
+
+sagemaker_inference_config = {
+    "TransformJobName": "sample_transform_job",
+    "ModelName": model_name,
+    "TransformInput": {
+        "DataSource": {
+            "S3DataSource": {
+                "S3DataType": "S3Prefix",
+                "S3Uri": f"s3://{s3_bucket}/config/config_date.yml",
+            }
+        },
+        "ContentType": "application/x-yaml",
+        "CompressionType": "None",
+        "SplitType": "None",
+    },
+    "TransformOutput": {"S3OutputPath": f"s3://{s3_bucket}/inferencing/output/"},
+    "TransformResources": {"InstanceType": "ml.m5.large", "InstanceCount": 1},
+}
+
+# [START howto_operator_sagemaker]
+with DAG(
+    "sample_sagemaker_dag",
+    schedule_interval=None,
+    start_date=datetime(2022, 2, 21),
+    catchup=False,
+) as dag:
+    sagemaker_processing_task = SageMakerProcessingOperator(
+        config=sagemaker_processing_job_config,
+        aws_conn_id="aws_default",

Review comment:
       I have already opened an issue for this. Thanks!







[GitHub] [airflow] hsrocks commented on pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
hsrocks commented on pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#issuecomment-1051129536


   > If in doubt rebase. In this case it will help as it has been fixed on main.
   
   Thanks @ashb! I already did it twice before, but let me do it again :) Thanks for the help!!





[GitHub] [airflow] hsrocks commented on a change in pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
hsrocks commented on a change in pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#discussion_r815423440



##########
File path: docs/apache-airflow-providers-amazon/operators/sagemaker.rst
##########
@@ -0,0 +1,72 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+Amazon SageMaker Operators
+========================================
+
+Prerequisite Tasks
+------------------
+
+.. include:: _partials/prerequisite_tasks.rst
+
+Overview
+--------
+
+Airflow to Amazon SageMaker integration provides several operators to create and interact with
+SageMaker Jobs.
+
+  - :class:`~airflow.providers.amazon.aws.operators.sagemaker.SageMakerDeleteModelOperator`
+  - :class:`~airflow.providers.amazon.aws.operators.sagemaker.SageMakerModelOperator`
+  - :class:`~airflow.providers.amazon.aws.operators.sagemaker.SageMakerProcessingOperator`
+  - :class:`~airflow.providers.amazon.aws.operators.sagemaker.SageMakerTrainingOperator`
+  - :class:`~airflow.providers.amazon.aws.operators.sagemaker.SageMakerTransformOperator`
+  - :class:`~airflow.providers.amazon.aws.operators.sagemaker.SageMakerTuningOperator`
+
+One example_dag is provided which showcases some of these operators in action.
+
+ - example_sagemaker.py
+
+--------------------------------------------------------------

Review comment:
       Sure @eladkal, I'm just testing the local doc build once. I will push the changes in a few minutes. Thanks a lot for the review :)







[GitHub] [airflow] hsrocks commented on a change in pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
hsrocks commented on a change in pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#discussion_r815422347



##########
File path: airflow/providers/amazon/aws/example_dags/example_sagemaker.py
##########
@@ -0,0 +1,177 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from datetime import datetime
+from os import environ
+
+from airflow import DAG
+from airflow.providers.amazon.aws.operators.sagemaker import (
+    SageMakerDeleteModelOperator,
+    SageMakerModelOperator,
+    SageMakerProcessingOperator,
+    SageMakerTrainingOperator,
+    SageMakerTransformOperator,
+)
+
+MODEL_NAME = "sample_model"
+TRAINING_JOB_NAME = "sample_training"
+IMAGE_URI = environ.get("ECR_IMAGE_URI", "123456789012.dkr.ecr.us-east-1.amazonaws.com/repo_name")
+S3_BUCKET = environ.get("BUCKET_NAME", "test-airflow-12345")
+ROLE = environ.get("SAGEMAKER_ROLE_ARN", "arn:aws:iam::123456789012:role/role_name")
+
+SAGEMAKER_PROCESSING_JOB_CONFIG = {
+    "ProcessingJobName": "sample_processing_job",
+    "ProcessingInputs": [
+        {
+            "InputName": "input",
+            "AppManaged": False,
+            "S3Input": {
+                "S3Uri": f"s3://{S3_BUCKET}/preprocessing/input/",
+                "LocalPath": "/opt/ml/processing/input/",
+                "S3DataType": "S3Prefix",
+                "S3InputMode": "File",
+                "S3DataDistributionType": "FullyReplicated",
+                "S3CompressionType": "None",
+            },
+        },
+    ],
+    "ProcessingOutputConfig": {
+        "Outputs": [
+            {
+                "OutputName": "output",
+                "S3Output": {
+                    "S3Uri": f"s3://{S3_BUCKET}/preprocessing/output/",
+                    "LocalPath": "/opt/ml/processing/output/",
+                    "S3UploadMode": "EndOfJob",
+                },
+                "AppManaged": False,
+            }
+        ]
+    },
+    "ProcessingResources": {
+        "ClusterConfig": {
+            "InstanceCount": 1,
+            "InstanceType": "ml.m5.large",
+            "VolumeSizeInGB": 5,
+        }
+    },
+    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
+    "AppSpecification": {
+        "ImageUri": f"{IMAGE_URI}",
+        "ContainerEntrypoint": ["python3", "./preprocessing.py"],
+    },
+    "RoleArn": f"{ROLE}",
+}
+
+SAGEMAKER_TRAINING_JOB_CONFIG = {
+    "AlgorithmSpecification": {
+        "TrainingImage": f"{IMAGE_URI}",
+        "TrainingInputMode": "File",
+    },
+    "InputDataConfig": [
+        {
+            "ChannelName": "config",
+            "DataSource": {
+                "S3DataSource": {
+                    "S3DataType": "S3Prefix",
+                    "S3Uri": f"s3://{S3_BUCKET}/config/",
+                    "S3DataDistributionType": "FullyReplicated",
+                }
+            },
+            "CompressionType": "None",
+            "RecordWrapperType": "None",
+        },
+    ],
+    "OutputDataConfig": {
+        "KmsKeyId": "",
+        "S3OutputPath": f"s3://{S3_BUCKET}/training/",
+    },
+    "ResourceConfig": {
+        "InstanceType": "ml.m5.large",
+        "InstanceCount": 1,
+        "VolumeSizeInGB": 5,
+    },
+    "StoppingCondition": {"MaxRuntimeInSeconds": 6000},
+    "RoleArn": f"{ROLE}",
+    "EnableNetworkIsolation": False,
+    "EnableInterContainerTrafficEncryption": False,
+    "EnableManagedSpotTraining": False,
+    "TrainingJobName": TRAINING_JOB_NAME,
+}
+
+SAGEMAKER_CREATE_MODEL_CONFIG = {
+    "ModelName": MODEL_NAME,
+    "Containers": [
+        {
+            "Image": f"{IMAGE_URI}",
+            "Mode": "SingleModel",
+            "ModelDataUrl": f"s3://{S3_BUCKET}/training/{TRAINING_JOB_NAME}/output/model.tar.gz",
+        }
+    ],
+    "ExecutionRoleArn": f"{ROLE}",
+    "EnableNetworkIsolation": False,
+}
+
+SAGEMAKER_INFERENCE_CONFIG = {
+    "TransformJobName": "sample_transform_job",
+    "ModelName": MODEL_NAME,
+    "TransformInput": {
+        "DataSource": {
+            "S3DataSource": {
+                "S3DataType": "S3Prefix",
+                "S3Uri": f"s3://{S3_BUCKET}/config/config_date.yml",
+            }
+        },
+        "ContentType": "application/x-yaml",
+        "CompressionType": "None",
+        "SplitType": "None",
+    },
+    "TransformOutput": {"S3OutputPath": f"s3://{S3_BUCKET}/inferencing/output/"},
+    "TransformResources": {"InstanceType": "ml.m5.large", "InstanceCount": 1},
+}
+
+# [START howto_operator_sagemaker]
+with DAG(
+    "sample_sagemaker_dag",
+    schedule_interval=None,
+    start_date=datetime(2022, 2, 21),
+    catchup=False,
+) as dag:
+    sagemaker_processing_task = SageMakerProcessingOperator(
+        config=SAGEMAKER_PROCESSING_JOB_CONFIG,
+        aws_conn_id="aws_default",
+        task_id="sagemaker_preprocessing_task",
+    )
+
+    training_task = SageMakerTrainingOperator(
+        config=SAGEMAKER_TRAINING_JOB_CONFIG, aws_conn_id="aws_default", task_id="sagemaker_training_task"
+    )
+
+    model_create_task = SageMakerModelOperator(
+        config=SAGEMAKER_CREATE_MODEL_CONFIG, aws_conn_id="aws_default", task_id="sagemaker_create_model_task"
+    )
+
+    inference_task = SageMakerTransformOperator(
+        config=SAGEMAKER_INFERENCE_CONFIG, aws_conn_id="aws_default", task_id="sagemaker_inference_task"
+    )
+
+    model_delete_task = SageMakerDeleteModelOperator(
+        task_id="sagemaker_delete_model_task", config={'ModelName': MODEL_NAME}, aws_conn_id="aws_default"
+    )

Review comment:
       SageMakerModelOperator will create the model as part of SAGEMAKER_CREATE_MODEL_CONFIG. Let me know if you still have any questions. If not, please mark it as resolved. Thanks :)







[GitHub] [airflow] hsrocks commented on a change in pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
hsrocks commented on a change in pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#discussion_r814039727



##########
File path: docs/apache-airflow-providers-amazon/operators/sagemaker.rst
##########
@@ -0,0 +1,71 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+Amazon Sagemaker Operators

Review comment:
       Sure, corrected it. Thanks!







[GitHub] [airflow] hsrocks commented on a change in pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
hsrocks commented on a change in pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#discussion_r814399480



##########
File path: airflow/providers/amazon/aws/example_dags/example_sagemaker.py
##########
@@ -0,0 +1,177 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from datetime import datetime
+from os import environ
+
+from airflow import DAG
+from airflow.providers.amazon.aws.operators.sagemaker import (
+    SageMakerDeleteModelOperator,
+    SageMakerModelOperator,
+    SageMakerProcessingOperator,
+    SageMakerTrainingOperator,
+    SageMakerTransformOperator,
+)
+
+MODEL_NAME = "sample_model"
+TRAINING_JOB_NAME = "sample_training"
+IMAGE_URI = environ.get("ECR_IMAGE_URI", "123456789012.dkr.ecr.us-east-1.amazonaws.com/repo_name")
+S3_BUCKET = environ.get("BUCKET_NAME", "test-airflow-12345")
+ROLE = environ.get("SAGEMAKER_ROLE_ARN", "arn:aws:iam::123456789012:role/role_name")
+
+SAGEMAKER_PROCESSING_JOB_CONFIG = {
+    "ProcessingJobName": "sample_processing_job",
+    "ProcessingInputs": [
+        {
+            "InputName": "input",
+            "AppManaged": False,
+            "S3Input": {
+                "S3Uri": f"s3://{S3_BUCKET}/preprocessing/input/",
+                "LocalPath": "/opt/ml/processing/input/",
+                "S3DataType": "S3Prefix",
+                "S3InputMode": "File",
+                "S3DataDistributionType": "FullyReplicated",
+                "S3CompressionType": "None",
+            },
+        },
+    ],
+    "ProcessingOutputConfig": {
+        "Outputs": [
+            {
+                "OutputName": "output",
+                "S3Output": {
+                    "S3Uri": f"s3://{S3_BUCKET}/preprocessing/output/",
+                    "LocalPath": "/opt/ml/processing/output/",
+                    "S3UploadMode": "EndOfJob",
+                },
+                "AppManaged": False,
+            }
+        ]
+    },
+    "ProcessingResources": {
+        "ClusterConfig": {
+            "InstanceCount": 1,
+            "InstanceType": "ml.m5.large",
+            "VolumeSizeInGB": 5,
+        }
+    },
+    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
+    "AppSpecification": {
+        "ImageUri": f"{IMAGE_URI}",
+        "ContainerEntrypoint": ["python3", "./preprocessing.py"],
+    },
+    "RoleArn": f"{ROLE}",
+}
+
+SAGEMAKER_TRAINING_JOB_CONFIG = {
+    "AlgorithmSpecification": {
+        "TrainingImage": f"{IMAGE_URI}",
+        "TrainingInputMode": "File",
+    },
+    "InputDataConfig": [
+        {
+            "ChannelName": "config",
+            "DataSource": {
+                "S3DataSource": {
+                    "S3DataType": "S3Prefix",
+                    "S3Uri": f"s3://{S3_BUCKET}/config/",
+                    "S3DataDistributionType": "FullyReplicated",
+                }
+            },
+            "CompressionType": "None",
+            "RecordWrapperType": "None",
+        },
+    ],
+    "OutputDataConfig": {
+        "KmsKeyId": "",
+        "S3OutputPath": f"s3://{S3_BUCKET}/training/",
+    },
+    "ResourceConfig": {
+        "InstanceType": "ml.m5.large",
+        "InstanceCount": 1,
+        "VolumeSizeInGB": 5,
+    },
+    "StoppingCondition": {"MaxRuntimeInSeconds": 6000},
+    "RoleArn": f"{ROLE}",
+    "EnableNetworkIsolation": False,
+    "EnableInterContainerTrafficEncryption": False,
+    "EnableManagedSpotTraining": False,
+    "TrainingJobName": TRAINING_JOB_NAME,
+}
+
+SAGEMAKER_CREATE_MODEL_CONFIG = {
+    "ModelName": MODEL_NAME,
+    "Containers": [
+        {
+            "Image": f"{IMAGE_URI}",
+            "Mode": "SingleModel",
+            "ModelDataUrl": f"s3://{S3_BUCKET}/training/{TRAINING_JOB_NAME}/output/model.tar.gz",
+        }
+    ],
+    "ExecutionRoleArn": f"{ROLE}",
+    "EnableNetworkIsolation": False,
+}
+
+SAGEMAKER_INFERENCE_CONFIG = {
+    "TransformJobName": "sample_transform_job",
+    "ModelName": MODEL_NAME,
+    "TransformInput": {
+        "DataSource": {
+            "S3DataSource": {
+                "S3DataType": "S3Prefix",
+                "S3Uri": f"s3://{S3_BUCKET}/config/config_date.yml",
+            }
+        },
+        "ContentType": "application/x-yaml",
+        "CompressionType": "None",
+        "SplitType": "None",
+    },
+    "TransformOutput": {"S3OutputPath": f"s3://{S3_BUCKET}/inferencing/output/"},
+    "TransformResources": {"InstanceType": "ml.m5.large", "InstanceCount": 1},
+}
+
+# [START howto_operator_sagemaker]
+with DAG(
+    "sample_sagemaker_dag",
+    schedule_interval=None,
+    start_date=datetime(2022, 2, 21),
+    catchup=False,
+) as dag:
+    sagemaker_processing_task = SageMakerProcessingOperator(
+        config=SAGEMAKER_PROCESSING_JOB_CONFIG,
+        aws_conn_id="aws_default",
+        task_id="sagemaker_preprocessing_task",
+    )
+
+    training_task = SageMakerTrainingOperator(
+        config=SAGEMAKER_TRAINING_JOB_CONFIG, aws_conn_id="aws_default", task_id="sagemaker_training_task"
+    )
+
+    model_create_task = SageMakerModelOperator(
+        config=SAGEMAKER_CREATE_MODEL_CONFIG, aws_conn_id="aws_default", task_id="sagemaker_create_model_task"
+    )
+
+    inference_task = SageMakerTransformOperator(
+        config=SAGEMAKER_INFERENCE_CONFIG, aws_conn_id="aws_default", task_id="sagemaker_inference_task"
+    )
+
+    model_delete_task = SageMakerDeleteModelOperator(
+        task_id="sagemaker_delete_model_task", config={'ModelName': MODEL_NAME}, aws_conn_id="aws_default"
+    )
+
+    sagemaker_processing_task >> training_task >> model_create_task >> inference_task >> model_delete_task
+    # [END howto_operator_sagemaker]

Review comment:
       > Curious whether `./breeze build-docs -- --package-filter apache-airflow-providers-amazon` passes locally. Specifically, the indent levels of the START and END tags here are different. Not sure whether that is an issue or not.
   
   Thanks a lot for the review! Yes, the check passed locally for me :)
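
The indentation concern above can be checked mechanically. The following is a minimal sketch, not part of the PR or of Airflow's docs tooling: `find_tag_indent_mismatches` is a hypothetical helper that pairs `# [START name]` and `# [END name]` markers in a source string and reports pairs whose leading whitespace differs. Whether such a mismatch actually breaks the docs build is not established here; the author reports that the build passed locally.

```python
import re

# Matches lines such as "# [START howto_operator_sagemaker]" and captures
# the leading whitespace, the START/END keyword, and the tag name.
TAG_RE = re.compile(
    r"^(?P<indent>[ \t]*)#\s*\[(?P<kind>START|END)\s+(?P<name>[\w.-]+)\]\s*$"
)


def find_tag_indent_mismatches(source: str) -> list:
    """Return (name, start_indent, end_indent) for pairs whose indents differ."""
    starts = {}
    mismatches = []
    for line in source.splitlines():
        match = TAG_RE.match(line)
        if match is None:
            continue
        indent = len(match.group("indent").expandtabs(4))
        name = match.group("name")
        if match.group("kind") == "START":
            # Remember where this tag was opened.
            starts[name] = indent
        elif name in starts:
            # Compare the END marker against its recorded START marker.
            start_indent = starts.pop(name)
            if start_indent != indent:
                mismatches.append((name, start_indent, indent))
    return mismatches


# Shortened stand-in for the example DAG: START at column 0, END indented.
EXAMPLE = """\
# [START howto_operator_sagemaker]
with DAG("sample_sagemaker_dag") as dag:
    pass
    # [END howto_operator_sagemaker]
"""

print(find_tag_indent_mismatches(EXAMPLE))
```

For the shortened example, the helper reports one mismatched pair, which mirrors the layout the reviewer pointed out in the diff.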







[GitHub] [airflow] hsrocks commented on a change in pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
hsrocks commented on a change in pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#discussion_r814042387



##########
File path: tests/providers/amazon/aws/operators/test_sagemaker_model.py
##########
@@ -63,3 +66,17 @@ def test_execute_with_failure(self, mock_model, mock_client):
         mock_model.return_value = {'ModelArn': 'testarn', 'ResponseMetadata': {'HTTPStatusCode': 404}}
         with pytest.raises(AirflowException):
             self.sagemaker.execute(None)
+
+
+class TestSageMakerDeleteModelOperator(unittest.TestCase):
+    def setUp(self):
+        self.sagemaker = SageMakerDeleteModelOperator(
+            task_id='test_sagemaker_operator', aws_conn_id='sagemaker_test_id', model_name='test'
+        )
+
+    @mock.patch.object(SageMakerHook, 'get_conn')
+    @mock.patch.object(SageMakerHook, 'delete_model')
+    def test_model_delete(self, mock_model, mock_client):

Review comment:
       Updated the test cases, so please verify them against the suggested approach.
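
For readers following the test discussion, here is a minimal, self-contained sketch of the mocking pattern that the truncated diff above appears to use. It does not import Airflow: `FakeSageMakerHook` and `DeleteModelOperatorSketch` are hypothetical stand-ins invented for illustration, and their signatures (including passing the model name through a `config` dict) are assumptions rather than the provider's real API. The point is only to show `mock.patch.object` replacing the hook method so that the test asserts on the call without touching AWS.

```python
import unittest
from unittest import mock


class FakeSageMakerHook:
    """Hypothetical stand-in for a hook; not the real Airflow class."""

    def delete_model(self, model_name: str) -> None:
        raise RuntimeError("would call AWS; must be patched in tests")


class DeleteModelOperatorSketch:
    """Hypothetical operator that delegates deletion to the hook."""

    def __init__(self, task_id: str, config: dict):
        self.task_id = task_id
        self.config = config

    def execute(self, context=None) -> None:
        hook = FakeSageMakerHook()
        hook.delete_model(model_name=self.config["ModelName"])


class TestDeleteModelOperatorSketch(unittest.TestCase):
    def setUp(self):
        self.operator = DeleteModelOperatorSketch(
            task_id="test_delete_model", config={"ModelName": "test"}
        )

    @mock.patch.object(FakeSageMakerHook, "delete_model")
    def test_execute_calls_delete_model(self, mock_delete):
        # The patched method records the call instead of raising.
        self.operator.execute(None)
        mock_delete.assert_called_once_with(model_name="test")


# Run the single test case programmatically and keep the result object.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestDeleteModelOperatorSketch)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```

Patching at the class level means the operator code stays unchanged under test; only the hook method is swapped out for the duration of the test method.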







[GitHub] [airflow] eladkal commented on a change in pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
eladkal commented on a change in pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#discussion_r815423157



##########
File path: airflow/providers/amazon/aws/example_dags/example_sagemaker.py
##########
@@ -0,0 +1,177 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from datetime import datetime
+from os import environ
+
+from airflow import DAG
+from airflow.providers.amazon.aws.operators.sagemaker import (
+    SageMakerDeleteModelOperator,
+    SageMakerModelOperator,
+    SageMakerProcessingOperator,
+    SageMakerTrainingOperator,
+    SageMakerTransformOperator,
+)
+
+MODEL_NAME = "sample_model"
+TRAINING_JOB_NAME = "sample_training"
+IMAGE_URI = environ.get("ECR_IMAGE_URI", "123456789012.dkr.ecr.us-east-1.amazonaws.com/repo_name")
+S3_BUCKET = environ.get("BUCKET_NAME", "test-airflow-12345")
+ROLE = environ.get("SAGEMAKER_ROLE_ARN", "arn:aws:iam::123456789012:role/role_name")
+
+SAGEMAKER_PROCESSING_JOB_CONFIG = {
+    "ProcessingJobName": "sample_processing_job",
+    "ProcessingInputs": [
+        {
+            "InputName": "input",
+            "AppManaged": False,
+            "S3Input": {
+                "S3Uri": f"s3://{S3_BUCKET}/preprocessing/input/",
+                "LocalPath": "/opt/ml/processing/input/",
+                "S3DataType": "S3Prefix",
+                "S3InputMode": "File",
+                "S3DataDistributionType": "FullyReplicated",
+                "S3CompressionType": "None",
+            },
+        },
+    ],
+    "ProcessingOutputConfig": {
+        "Outputs": [
+            {
+                "OutputName": "output",
+                "S3Output": {
+                    "S3Uri": f"s3://{S3_BUCKET}/preprocessing/output/",
+                    "LocalPath": "/opt/ml/processing/output/",
+                    "S3UploadMode": "EndOfJob",
+                },
+                "AppManaged": False,
+            }
+        ]
+    },
+    "ProcessingResources": {
+        "ClusterConfig": {
+            "InstanceCount": 1,
+            "InstanceType": "ml.m5.large",
+            "VolumeSizeInGB": 5,
+        }
+    },
+    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
+    "AppSpecification": {
+        "ImageUri": f"{IMAGE_URI}",
+        "ContainerEntrypoint": ["python3", "./preprocessing.py"],
+    },
+    "RoleArn": f"{ROLE}",
+}
+
+SAGEMAKER_TRAINING_JOB_CONFIG = {
+    "AlgorithmSpecification": {
+        "TrainingImage": f"{IMAGE_URI}",
+        "TrainingInputMode": "File",
+    },
+    "InputDataConfig": [
+        {
+            "ChannelName": "config",
+            "DataSource": {
+                "S3DataSource": {
+                    "S3DataType": "S3Prefix",
+                    "S3Uri": f"s3://{S3_BUCKET}/config/",
+                    "S3DataDistributionType": "FullyReplicated",
+                }
+            },
+            "CompressionType": "None",
+            "RecordWrapperType": "None",
+        },
+    ],
+    "OutputDataConfig": {
+        "KmsKeyId": "",
+        "S3OutputPath": f"s3://{S3_BUCKET}/training/",
+    },
+    "ResourceConfig": {
+        "InstanceType": "ml.m5.large",
+        "InstanceCount": 1,
+        "VolumeSizeInGB": 5,
+    },
+    "StoppingCondition": {"MaxRuntimeInSeconds": 6000},
+    "RoleArn": f"{ROLE}",
+    "EnableNetworkIsolation": False,
+    "EnableInterContainerTrafficEncryption": False,
+    "EnableManagedSpotTraining": False,
+    "TrainingJobName": TRAINING_JOB_NAME,
+}
+
+SAGEMAKER_CREATE_MODEL_CONFIG = {
+    "ModelName": MODEL_NAME,
+    "Containers": [
+        {
+            "Image": f"{IMAGE_URI}",
+            "Mode": "SingleModel",
+            "ModelDataUrl": f"s3://{S3_BUCKET}/training/{TRAINING_JOB_NAME}/output/model.tar.gz",
+        }
+    ],
+    "ExecutionRoleArn": f"{ROLE}",
+    "EnableNetworkIsolation": False,
+}
+
+SAGEMAKER_INFERENCE_CONFIG = {
+    "TransformJobName": "sample_transform_job",
+    "ModelName": MODEL_NAME,
+    "TransformInput": {
+        "DataSource": {
+            "S3DataSource": {
+                "S3DataType": "S3Prefix",
+                "S3Uri": f"s3://{S3_BUCKET}/config/config_date.yml",
+            }
+        },
+        "ContentType": "application/x-yaml",
+        "CompressionType": "None",
+        "SplitType": "None",
+    },
+    "TransformOutput": {"S3OutputPath": f"s3://{S3_BUCKET}/inferencing/output/"},
+    "TransformResources": {"InstanceType": "ml.m5.large", "InstanceCount": 1},
+}
+
+# [START howto_operator_sagemaker]
+with DAG(
+    "sample_sagemaker_dag",
+    schedule_interval=None,
+    start_date=datetime(2022, 2, 21),
+    catchup=False,
+) as dag:
+    sagemaker_processing_task = SageMakerProcessingOperator(
+        config=SAGEMAKER_PROCESSING_JOB_CONFIG,
+        aws_conn_id="aws_default",
+        task_id="sagemaker_preprocessing_task",
+    )
+
+    training_task = SageMakerTrainingOperator(
+        config=SAGEMAKER_TRAINING_JOB_CONFIG, aws_conn_id="aws_default", task_id="sagemaker_training_task"
+    )
+
+    model_create_task = SageMakerModelOperator(
+        config=SAGEMAKER_CREATE_MODEL_CONFIG, aws_conn_id="aws_default", task_id="sagemaker_create_model_task"
+    )
+
+    inference_task = SageMakerTransformOperator(
+        config=SAGEMAKER_INFERENCE_CONFIG, aws_conn_id="aws_default", task_id="sagemaker_inference_task"
+    )
+
+    model_delete_task = SageMakerDeleteModelOperator(
+        task_id="sagemaker_delete_model_task", config={'ModelName': MODEL_NAME}, aws_conn_id="aws_default"
+    )

Review comment:
       Ah OK now I see it. It uses `SAGEMAKER_CREATE_MODEL_CONFIG` which has the `sample_model` name.
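
The dependency described here, where the delete task must target the same model that the create task registered, can be illustrated without Airflow. This is a small sketch under the assumption that both tasks read the name from one module-level constant, as the example DAG does; `build_delete_config` is a hypothetical helper and is not part of the PR.

```python
MODEL_NAME = "sample_model"

# Trimmed copy of the create-model config from the example DAG.
SAGEMAKER_CREATE_MODEL_CONFIG = {
    "ModelName": MODEL_NAME,
    "EnableNetworkIsolation": False,
}


def build_delete_config(create_config: dict) -> dict:
    """Derive the delete config from the create config (hypothetical helper)."""
    return {"ModelName": create_config["ModelName"]}


delete_config = build_delete_config(SAGEMAKER_CREATE_MODEL_CONFIG)
print(delete_config)
```

Deriving the delete config from the create config, or from the shared constant, keeps the two tasks from drifting apart if the model name is later changed in one place only.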







[GitHub] [airflow] eladkal commented on a change in pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
eladkal commented on a change in pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#discussion_r815423238



##########
File path: docs/apache-airflow-providers-amazon/operators/sagemaker.rst
##########
@@ -0,0 +1,72 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+Amazon SageMaker Operators
+========================================
+
+Prerequisite Tasks
+------------------
+
+.. include:: _partials/prerequisite_tasks.rst
+
+Overview
+--------
+
+Airflow to Amazon SageMaker integration provides several operators to create and interact with
+SageMaker Jobs.
+
+  - :class:`~airflow.providers.amazon.aws.operators.sagemaker.SageMakerDeleteModelOperator`
+  - :class:`~airflow.providers.amazon.aws.operators.sagemaker.SageMakerModelOperator`
+  - :class:`~airflow.providers.amazon.aws.operators.sagemaker.SageMakerProcessingOperator`
+  - :class:`~airflow.providers.amazon.aws.operators.sagemaker.SageMakerTrainingOperator`
+  - :class:`~airflow.providers.amazon.aws.operators.sagemaker.SageMakerTransformOperator`
+  - :class:`~airflow.providers.amazon.aws.operators.sagemaker.SageMakerTuningOperator`
+
+One example_dag is provided which showcases some of these operators in action.
+
+ - example_sagemaker.py
+
+--------------------------------------------------------------

Review comment:
       @hsrocks I believe this point is the only comment left to address from my side.







[GitHub] [airflow] eladkal commented on pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
eladkal commented on pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#issuecomment-1045980385


   Can you please update docs and add example dag?
   https://github.com/apache/airflow/tree/main/airflow/providers/amazon/aws/example_dags
   https://github.com/apache/airflow/tree/main/docs/apache-airflow-providers-amazon/operators





[GitHub] [airflow] potiuk commented on pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#issuecomment-1046035627


   > > Can you please add docs and example dag? https://github.com/apache/airflow/tree/main/airflow/providers/amazon/aws/example_dags https://github.com/apache/airflow/tree/main/docs/apache-airflow-providers-amazon/operators
   > 
   > Sure! Valid point, @eladkal. I can see it's missing for SageMaker altogether. Will do it. The failed test 'Tests / MySQL5.7, Py3.7: Always Integration Providers (pull_request)' is not related to the changes. Can you please advise, @eladkal?
   
   It could be a flaky failure. Try again.





[GitHub] [airflow] eladkal edited a comment on pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
eladkal edited a comment on pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#issuecomment-1045980385


   Can you please add docs and example dag?
   https://github.com/apache/airflow/tree/main/airflow/providers/amazon/aws/example_dags
   https://github.com/apache/airflow/tree/main/docs/apache-airflow-providers-amazon/operators





[GitHub] [airflow] ferruzzi edited a comment on pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
ferruzzi edited a comment on pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#issuecomment-1051112684


   ![image](https://user-images.githubusercontent.com/1920178/155771942-e64c26d0-09ad-4a39-ab61-e12563be3d8c.png)
   @hsrocks The failing test doesn't appear to be related. The file wasn't touched by this PR and the failing line hasn't been touched in months, but @ashb was in that file yesterday and may have more info. A bunch of PRs are red, so it may be a known issue.





[GitHub] [airflow] ferruzzi commented on a change in pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#discussion_r814335853



##########
File path: airflow/providers/amazon/aws/example_dags/example_sagemaker.py
##########
@@ -0,0 +1,177 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from datetime import datetime
+from os import environ
+
+from airflow import DAG
+from airflow.providers.amazon.aws.operators.sagemaker import (
+    SageMakerDeleteModelOperator,
+    SageMakerModelOperator,
+    SageMakerProcessingOperator,
+    SageMakerTrainingOperator,
+    SageMakerTransformOperator,
+)
+
+MODEL_NAME = "sample_model"
+TRAINING_JOB_NAME = "sample_training"
+IMAGE_URI = environ.get("ECR_IMAGE_URI", "123456789012.dkr.ecr.us-east-1.amazonaws.com/repo_name")
+S3_BUCKET = environ.get("BUCKET_NAME", "test-airflow-12345")
+ROLE = environ.get("SAGEMAKER_ROLE_ARN", "arn:aws:iam::123456789012:role/role_name")
+
+SAGEMAKER_PROCESSING_JOB_CONFIG = {
+    "ProcessingJobName": "sample_processing_job",
+    "ProcessingInputs": [
+        {
+            "InputName": "input",
+            "AppManaged": False,
+            "S3Input": {
+                "S3Uri": f"s3://{S3_BUCKET}/preprocessing/input/",
+                "LocalPath": "/opt/ml/processing/input/",
+                "S3DataType": "S3Prefix",
+                "S3InputMode": "File",
+                "S3DataDistributionType": "FullyReplicated",
+                "S3CompressionType": "None",
+            },
+        },
+    ],
+    "ProcessingOutputConfig": {
+        "Outputs": [
+            {
+                "OutputName": "output",
+                "S3Output": {
+                    "S3Uri": f"s3://{S3_BUCKET}/preprocessing/output/",
+                    "LocalPath": "/opt/ml/processing/output/",
+                    "S3UploadMode": "EndOfJob",
+                },
+                "AppManaged": False,
+            }
+        ]
+    },
+    "ProcessingResources": {
+        "ClusterConfig": {
+            "InstanceCount": 1,
+            "InstanceType": "ml.m5.large",
+            "VolumeSizeInGB": 5,
+        }
+    },
+    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
+    "AppSpecification": {
+        "ImageUri": f"{IMAGE_URI}",
+        "ContainerEntrypoint": ["python3", "./preprocessing.py"],
+    },
+    "RoleArn": f"{ROLE}",
+}
+
+SAGEMAKER_TRAINING_JOB_CONFIG = {
+    "AlgorithmSpecification": {
+        "TrainingImage": f"{IMAGE_URI}",
+        "TrainingInputMode": "File",
+    },
+    "InputDataConfig": [
+        {
+            "ChannelName": "config",
+            "DataSource": {
+                "S3DataSource": {
+                    "S3DataType": "S3Prefix",
+                    "S3Uri": f"s3://{S3_BUCKET}/config/",
+                    "S3DataDistributionType": "FullyReplicated",
+                }
+            },
+            "CompressionType": "None",
+            "RecordWrapperType": "None",
+        },
+    ],
+    "OutputDataConfig": {
+        "KmsKeyId": "",
+        "S3OutputPath": f"s3://{S3_BUCKET}/training/",
+    },
+    "ResourceConfig": {
+        "InstanceType": "ml.m5.large",
+        "InstanceCount": 1,
+        "VolumeSizeInGB": 5,
+    },
+    "StoppingCondition": {"MaxRuntimeInSeconds": 6000},
+    "RoleArn": f"{ROLE}",
+    "EnableNetworkIsolation": False,
+    "EnableInterContainerTrafficEncryption": False,
+    "EnableManagedSpotTraining": False,
+    "TrainingJobName": TRAINING_JOB_NAME,
+}
+
+SAGEMAKER_CREATE_MODEL_CONFIG = {
+    "ModelName": MODEL_NAME,
+    "Containers": [
+        {
+            "Image": f"{IMAGE_URI}",
+            "Mode": "SingleModel",
+            "ModelDataUrl": f"s3://{S3_BUCKET}/training/{TRAINING_JOB_NAME}/output/model.tar.gz",
+        }
+    ],
+    "ExecutionRoleArn": f"{ROLE}",
+    "EnableNetworkIsolation": False,
+}
+
+SAGEMAKER_INFERENCE_CONFIG = {
+    "TransformJobName": "sample_transform_job",
+    "ModelName": MODEL_NAME,
+    "TransformInput": {
+        "DataSource": {
+            "S3DataSource": {
+                "S3DataType": "S3Prefix",
+                "S3Uri": f"s3://{S3_BUCKET}/config/config_date.yml",
+            }
+        },
+        "ContentType": "application/x-yaml",
+        "CompressionType": "None",
+        "SplitType": "None",
+    },
+    "TransformOutput": {"S3OutputPath": f"s3://{S3_BUCKET}/inferencing/output/"},
+    "TransformResources": {"InstanceType": "ml.m5.large", "InstanceCount": 1},
+}
+
+# [START howto_operator_sagemaker]
+with DAG(
+    "sample_sagemaker_dag",
+    schedule_interval=None,
+    start_date=datetime(2022, 2, 21),
+    catchup=False,
+) as dag:
+    sagemaker_processing_task = SageMakerProcessingOperator(
+        config=SAGEMAKER_PROCESSING_JOB_CONFIG,
+        aws_conn_id="aws_default",
+        task_id="sagemaker_preprocessing_task",
+    )
+
+    training_task = SageMakerTrainingOperator(
+        config=SAGEMAKER_TRAINING_JOB_CONFIG, aws_conn_id="aws_default", task_id="sagemaker_training_task"
+    )
+
+    model_create_task = SageMakerModelOperator(
+        config=SAGEMAKER_CREATE_MODEL_CONFIG, aws_conn_id="aws_default", task_id="sagemaker_create_model_task"
+    )
+
+    inference_task = SageMakerTransformOperator(
+        config=SAGEMAKER_INFERENCE_CONFIG, aws_conn_id="aws_default", task_id="sagemaker_inference_task"
+    )
+
+    model_delete_task = SageMakerDeleteModelOperator(
+        task_id="sagemaker_delete_model_task", config={'ModelName': MODEL_NAME}, aws_conn_id="aws_default"
+    )
+
+    sagemaker_processing_task >> training_task >> model_create_task >> inference_task >> model_delete_task
+    # [END howto_operator_sagemaker]

Review comment:
       Curious whether `./breeze build-docs -- --package-filter apache-airflow-providers-amazon` passes locally. Specifically, the indent levels of the START and END tags here are different. Not sure if that is an issue or not.
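
The indentation concern can be checked mechanically before running the docs build. The following is a minimal sketch, not part of the PR: the helper names `marker_indents` and `mismatched_markers` are hypothetical, and it assumes the markers take the `# [START name]` / `# [END name]` form shown in the diff above. It only reports a mismatch; it does not tell you whether the docs build actually fails because of one.

```python
import re

# Matches marker comments such as "# [START howto_operator_sagemaker]".
MARKER = re.compile(r"^(?P<indent>[ \t]*)# \[(?P<kind>START|END) (?P<name>[\w-]+)\]\s*$")


def marker_indents(source: str) -> dict:
    """Map each marker name to the indent width of its START and END lines."""
    found = {}
    for line in source.splitlines():
        match = MARKER.match(line)
        if match:
            width = len(match["indent"].expandtabs(4))
            found.setdefault(match["name"], {})[match["kind"]] = width
    return found


def mismatched_markers(source: str) -> list:
    """Return marker names whose START/END pair is incomplete or indented differently."""
    return sorted(
        name
        for name, kinds in marker_indents(source).items()
        if kinds.get("START") != kinds.get("END")
    )


# Trimmed-down stand-in for the example DAG: START at column 0, END indented.
SAMPLE = '''\
# [START howto_operator_sagemaker]
with DAG("sample_sagemaker_dag") as dag:
    task_a >> task_b
    # [END howto_operator_sagemaker]
'''

print(mismatched_markers(SAMPLE))
```

Running it on the real file contents (read with `pathlib.Path(...).read_text()`) would list any marker pair worth a second look.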







[GitHub] [airflow] ferruzzi commented on pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#issuecomment-1050345233


   Looks like my points have been addressed other than that new question about the indenting of the START/END flag in the sample DAG, which may or may not be a concern.  Thanks for the changes.





[GitHub] [airflow] boring-cyborg[bot] commented on pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#issuecomment-1053539731


   Awesome work, congrats on your first merged pull request!
   





[GitHub] [airflow] eladkal commented on a change in pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
eladkal commented on a change in pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#discussion_r815418001



##########
File path: docs/apache-airflow-providers-amazon/operators/sagemaker.rst
##########
@@ -0,0 +1,72 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+Amazon SageMaker Operators
+========================================
+
+Prerequisite Tasks
+------------------
+
+.. include:: _partials/prerequisite_tasks.rst
+
+Overview
+--------
+
+Airflow to Amazon SageMaker integration provides several operators to create and interact with
+SageMaker Jobs.
+
+  - :class:`~airflow.providers.amazon.aws.operators.sagemaker.SageMakerDeleteModelOperator`
+  - :class:`~airflow.providers.amazon.aws.operators.sagemaker.SageMakerModelOperator`
+  - :class:`~airflow.providers.amazon.aws.operators.sagemaker.SageMakerProcessingOperator`
+  - :class:`~airflow.providers.amazon.aws.operators.sagemaker.SageMakerTrainingOperator`
+  - :class:`~airflow.providers.amazon.aws.operators.sagemaker.SageMakerTransformOperator`
+  - :class:`~airflow.providers.amazon.aws.operators.sagemaker.SageMakerTuningOperator`
+
+One example_dag is provided which showcases some of these operators in action.
+
+ - example_sagemaker.py
+
+--------------------------------------------------------------

Review comment:
       There is no need to specify it.

##########
File path: airflow/providers/amazon/aws/example_dags/example_sagemaker.py
##########
@@ -0,0 +1,177 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from datetime import datetime
+from os import environ
+
+from airflow import DAG
+from airflow.providers.amazon.aws.operators.sagemaker import (
+    SageMakerDeleteModelOperator,
+    SageMakerModelOperator,
+    SageMakerProcessingOperator,
+    SageMakerTrainingOperator,
+    SageMakerTransformOperator,
+)
+
+MODEL_NAME = "sample_model"
+TRAINING_JOB_NAME = "sample_training"
+IMAGE_URI = environ.get("ECR_IMAGE_URI", "123456789012.dkr.ecr.us-east-1.amazonaws.com/repo_name")
+S3_BUCKET = environ.get("BUCKET_NAME", "test-airflow-12345")
+ROLE = environ.get("SAGEMAKER_ROLE_ARN", "arn:aws:iam::123456789012:role/role_name")
+
+SAGEMAKER_PROCESSING_JOB_CONFIG = {
+    "ProcessingJobName": "sample_processing_job",
+    "ProcessingInputs": [
+        {
+            "InputName": "input",
+            "AppManaged": False,
+            "S3Input": {
+                "S3Uri": f"s3://{S3_BUCKET}/preprocessing/input/",
+                "LocalPath": "/opt/ml/processing/input/",
+                "S3DataType": "S3Prefix",
+                "S3InputMode": "File",
+                "S3DataDistributionType": "FullyReplicated",
+                "S3CompressionType": "None",
+            },
+        },
+    ],
+    "ProcessingOutputConfig": {
+        "Outputs": [
+            {
+                "OutputName": "output",
+                "S3Output": {
+                    "S3Uri": f"s3://{S3_BUCKET}/preprocessing/output/",
+                    "LocalPath": "/opt/ml/processing/output/",
+                    "S3UploadMode": "EndOfJob",
+                },
+                "AppManaged": False,
+            }
+        ]
+    },
+    "ProcessingResources": {
+        "ClusterConfig": {
+            "InstanceCount": 1,
+            "InstanceType": "ml.m5.large",
+            "VolumeSizeInGB": 5,
+        }
+    },
+    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
+    "AppSpecification": {
+        "ImageUri": f"{IMAGE_URI}",
+        "ContainerEntrypoint": ["python3", "./preprocessing.py"],
+    },
+    "RoleArn": f"{ROLE}",
+}
+
+SAGEMAKER_TRAINING_JOB_CONFIG = {
+    "AlgorithmSpecification": {
+        "TrainingImage": f"{IMAGE_URI}",
+        "TrainingInputMode": "File",
+    },
+    "InputDataConfig": [
+        {
+            "ChannelName": "config",
+            "DataSource": {
+                "S3DataSource": {
+                    "S3DataType": "S3Prefix",
+                    "S3Uri": f"s3://{S3_BUCKET}/config/",
+                    "S3DataDistributionType": "FullyReplicated",
+                }
+            },
+            "CompressionType": "None",
+            "RecordWrapperType": "None",
+        },
+    ],
+    "OutputDataConfig": {
+        "KmsKeyId": "",
+        "S3OutputPath": f"s3://{S3_BUCKET}/training/",
+    },
+    "ResourceConfig": {
+        "InstanceType": "ml.m5.large",
+        "InstanceCount": 1,
+        "VolumeSizeInGB": 5,
+    },
+    "StoppingCondition": {"MaxRuntimeInSeconds": 6000},
+    "RoleArn": f"{ROLE}",
+    "EnableNetworkIsolation": False,
+    "EnableInterContainerTrafficEncryption": False,
+    "EnableManagedSpotTraining": False,
+    "TrainingJobName": TRAINING_JOB_NAME,
+}
+
+SAGEMAKER_CREATE_MODEL_CONFIG = {
+    "ModelName": MODEL_NAME,
+    "Containers": [
+        {
+            "Image": f"{IMAGE_URI}",
+            "Mode": "SingleModel",
+            "ModelDataUrl": f"s3://{S3_BUCKET}/training/{TRAINING_JOB_NAME}/output/model.tar.gz",
+        }
+    ],
+    "ExecutionRoleArn": f"{ROLE}",
+    "EnableNetworkIsolation": False,
+}
+
+SAGEMAKER_INFERENCE_CONFIG = {
+    "TransformJobName": "sample_transform_job",
+    "ModelName": MODEL_NAME,
+    "TransformInput": {
+        "DataSource": {
+            "S3DataSource": {
+                "S3DataType": "S3Prefix",
+                "S3Uri": f"s3://{S3_BUCKET}/config/config_date.yml",
+            }
+        },
+        "ContentType": "application/x-yaml",
+        "CompressionType": "None",
+        "SplitType": "None",
+    },
+    "TransformOutput": {"S3OutputPath": f"s3://{S3_BUCKET}/inferencing/output/"},
+    "TransformResources": {"InstanceType": "ml.m5.large", "InstanceCount": 1},
+}
+
+# [START howto_operator_sagemaker]
+with DAG(
+    "sample_sagemaker_dag",
+    schedule_interval=None,
+    start_date=datetime(2022, 2, 21),
+    catchup=False,
+) as dag:
+    sagemaker_processing_task = SageMakerProcessingOperator(
+        config=SAGEMAKER_PROCESSING_JOB_CONFIG,
+        aws_conn_id="aws_default",
+        task_id="sagemaker_preprocessing_task",
+    )
+
+    training_task = SageMakerTrainingOperator(
+        config=SAGEMAKER_TRAINING_JOB_CONFIG, aws_conn_id="aws_default", task_id="sagemaker_training_task"
+    )
+
+    model_create_task = SageMakerModelOperator(
+        config=SAGEMAKER_CREATE_MODEL_CONFIG, aws_conn_id="aws_default", task_id="sagemaker_create_model_task"
+    )
+
+    inference_task = SageMakerTransformOperator(
+        config=SAGEMAKER_INFERENCE_CONFIG, aws_conn_id="aws_default", task_id="sagemaker_inference_task"
+    )
+
+    model_delete_task = SageMakerDeleteModelOperator(
+        task_id="sagemaker_delete_model_task", config={'ModelName': MODEL_NAME}, aws_conn_id="aws_default"
+    )

Review comment:
       I have never worked with SageMaker before, so this may be a dumb question, but the flow here seems a bit odd to me.
   To delete something you must first create it. Which task creates the model?
   I'm asking because you are deleting `"sample_model"`; where was this model created?







[GitHub] [airflow] ferruzzi commented on pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#issuecomment-1051112684


   ![image](https://user-images.githubusercontent.com/1920178/155771942-e64c26d0-09ad-4a39-ab61-e12563be3d8c.png)
   @hsrocks The failing test doesn't appear to be related.  The file wasn't touched by this PR and the line that is failing hasn't been touched in months, but @ashb was in that file yesterday and may have more info.





[GitHub] [airflow] hsrocks commented on a change in pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
hsrocks commented on a change in pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#discussion_r814039556



##########
File path: airflow/providers/amazon/aws/example_dags/example_sagemaker.py
##########
@@ -0,0 +1,179 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from datetime import datetime
+from os import environ
+
+from airflow import DAG
+from airflow.providers.amazon.aws.operators.sagemaker import (
+    SageMakerDeleteModelOperator,
+    SageMakerModelOperator,
+    SageMakerProcessingOperator,
+    SageMakerTrainingOperator,
+    SageMakerTransformOperator,
+)
+
+model_name = "sample_model"
+training_job_name = 'sample_training'
+image_uri = environ.get('ECR_IMAGE_URI', '123456789012.dkr.ecr.us-east-1.amazonaws.com/repo_name')
+s3_bucket = environ.get('BUCKET_NAME', 'test-airflow-12345')
+role = environ.get('SAGEMAKER_ROLE_ARN', 'arn:aws:iam::123456789012:role/role_name')
+
+sagemaker_processing_job_config = {
+    "ProcessingJobName": "sample_processing_job",
+    "ProcessingInputs": [
+        {
+            "InputName": "input",
+            "AppManaged": False,
+            "S3Input": {
+                "S3Uri": f"s3://{s3_bucket}/preprocessing/input/",
+                "LocalPath": "/opt/ml/processing/input/",
+                "S3DataType": "S3Prefix",
+                "S3InputMode": "File",
+                "S3DataDistributionType": "FullyReplicated",
+                "S3CompressionType": "None",
+            },
+        },
+    ],
+    "ProcessingOutputConfig": {
+        "Outputs": [
+            {
+                "OutputName": "output",
+                "S3Output": {
+                    "S3Uri": f"s3://{s3_bucket}/preprocessing/output/",
+                    "LocalPath": "/opt/ml/processing/output/",
+                    "S3UploadMode": "EndOfJob",
+                },
+                "AppManaged": False,
+            }
+        ]
+    },
+    "ProcessingResources": {
+        "ClusterConfig": {
+            "InstanceCount": 1,
+            "InstanceType": "ml.m5.large",
+            "VolumeSizeInGB": 5,
+        }
+    },
+    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
+    "AppSpecification": {
+        "ImageUri": f"{image_uri}",
+        "ContainerEntrypoint": ["python3", "./preprocessing.py"],
+    },
+    "RoleArn": f"{role}",
+}
+
+sagemaker_training_job_config = {
+    "AlgorithmSpecification": {
+        "TrainingImage": f"{image_uri}",
+        "TrainingInputMode": "File",
+    },
+    "InputDataConfig": [
+        {
+            "ChannelName": "config",
+            "DataSource": {
+                "S3DataSource": {
+                    "S3DataType": "S3Prefix",
+                    "S3Uri": f"s3://{s3_bucket}/config/",
+                    "S3DataDistributionType": "FullyReplicated",
+                }
+            },
+            "CompressionType": "None",
+            "RecordWrapperType": "None",
+        },
+    ],
+    "OutputDataConfig": {
+        "KmsKeyId": "",
+        "S3OutputPath": f"s3://{s3_bucket}/training/",
+    },
+    "ResourceConfig": {
+        "InstanceType": "ml.m5.large",
+        "InstanceCount": 1,
+        "VolumeSizeInGB": 5,
+    },
+    "StoppingCondition": {"MaxRuntimeInSeconds": 6000},
+    "RoleArn": f"{role}",
+    "EnableNetworkIsolation": False,
+    "EnableInterContainerTrafficEncryption": False,
+    "EnableManagedSpotTraining": False,
+    "TrainingJobName": training_job_name,
+}
+
+sagemaker_create_model_config = {
+    "ModelName": model_name,
+    "Containers": [
+        {
+            "Image": f"{image_uri}",
+            "Mode": "SingleModel",
+            "ModelDataUrl": f"s3://{s3_bucket}/training/{training_job_name}/output/model.tar.gz",
+        }
+    ],
+    "ExecutionRoleArn": f"{role}",
+    "EnableNetworkIsolation": False,
+}
+
+sagemaker_inference_config = {
+    "TransformJobName": "sample_transform_job",
+    "ModelName": model_name,
+    "TransformInput": {
+        "DataSource": {
+            "S3DataSource": {
+                "S3DataType": "S3Prefix",
+                "S3Uri": f"s3://{s3_bucket}/config/config_date.yml",
+            }
+        },
+        "ContentType": "application/x-yaml",
+        "CompressionType": "None",
+        "SplitType": "None",
+    },
+    "TransformOutput": {"S3OutputPath": f"s3://{s3_bucket}/inferencing/output/"},
+    "TransformResources": {"InstanceType": "ml.m5.large", "InstanceCount": 1},
+}
+
+# [START howto_operator_sagemaker]
+with DAG(
+    "sample_sagemaker_dag",
+    schedule_interval=None,
+    start_date=datetime(2022, 2, 21),
+    catchup=False,
+) as dag:
+    sagemaker_processing_task = SageMakerProcessingOperator(
+        config=sagemaker_processing_job_config,
+        aws_conn_id="aws_default",
+        task_id="sagemaker_preprocessing_task",
+    )
+
+    training_task = SageMakerTrainingOperator(
+        config=sagemaker_training_job_config, aws_conn_id="aws_default", task_id="sagemaker_training_task"
+    )
+
+    model_delete_task = SageMakerDeleteModelOperator(
+        task_id="sagemaker_delete_model_task",
+        model_name=model_name,

Review comment:
       I got the intent. Thanks!
   
   With the previous changes, deleting a model that was not present did not throw a ValidationException or ResourceNotFound exception; it simply returned false, so the example DAG was working fine, and I tested it on my end with a sample bucket and S3 path. With the latest changes the exception will be raised, so I have updated the example accordingly. Please check and resolve. Thanks :)
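
The behaviour change described in this comment can be illustrated with a small self-contained sketch. Everything here is hypothetical: `FakeSageMakerClient` and `ModelNotFound` are in-memory stand-ins, not the boto3 client or the hook code from this PR, and the two functions only mirror the "return false" and "raise" behaviours as the comment describes them.

```python
class ModelNotFound(Exception):
    """Hypothetical stand-in for the error raised when the model does not exist."""


class FakeSageMakerClient:
    """Hypothetical in-memory stub used only for this illustration."""

    def __init__(self, models):
        self.models = set(models)

    def delete_model(self, ModelName):
        if ModelName not in self.models:
            raise ModelNotFound(f"Could not find model {ModelName!r}")
        self.models.remove(ModelName)


def delete_model_lenient(client, model_name):
    """Earlier behaviour as described: swallow the error and return False."""
    try:
        client.delete_model(ModelName=model_name)
        return True
    except ModelNotFound:
        return False


def delete_model_strict(client, model_name):
    """Later behaviour as described: let the error propagate to the caller."""
    client.delete_model(ModelName=model_name)


empty_client = FakeSageMakerClient(models=[])
print(delete_model_lenient(empty_client, "sample_model"))

try:
    delete_model_strict(empty_client, "sample_model")
except ModelNotFound as err:
    print(f"delete failed: {err}")
```

Under the strict variant a delete task placed before any task that creates the model would fail, which is why the example DAG needs a create step ahead of the delete step.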







[GitHub] [airflow] hsrocks commented on a change in pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
hsrocks commented on a change in pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#discussion_r814037379



##########
File path: airflow/providers/amazon/aws/example_dags/example_sagemaker.py
##########
@@ -0,0 +1,179 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from datetime import datetime
+from os import environ
+
+from airflow import DAG
+from airflow.providers.amazon.aws.operators.sagemaker import (
+    SageMakerDeleteModelOperator,
+    SageMakerModelOperator,
+    SageMakerProcessingOperator,
+    SageMakerTrainingOperator,
+    SageMakerTransformOperator,
+)
+
+model_name = "sample_model"
+training_job_name = 'sample_training'
+image_uri = environ.get('ECR_IMAGE_URI', '123456789012.dkr.ecr.us-east-1.amazonaws.com/repo_name')
+s3_bucket = environ.get('BUCKET_NAME', 'test-airflow-12345')
+role = environ.get('SAGEMAKER_ROLE_ARN', 'arn:aws:iam::123456789012:role/role_name')
+
+sagemaker_processing_job_config = {
+    "ProcessingJobName": "sample_processing_job",
+    "ProcessingInputs": [
+        {
+            "InputName": "input",
+            "AppManaged": False,
+            "S3Input": {
+                "S3Uri": f"s3://{s3_bucket}/preprocessing/input/",
+                "LocalPath": "/opt/ml/processing/input/",
+                "S3DataType": "S3Prefix",
+                "S3InputMode": "File",
+                "S3DataDistributionType": "FullyReplicated",
+                "S3CompressionType": "None",
+            },
+        },
+    ],
+    "ProcessingOutputConfig": {
+        "Outputs": [
+            {
+                "OutputName": "output",
+                "S3Output": {
+                    "S3Uri": f"s3://{s3_bucket}/preprocessing/output/",
+                    "LocalPath": "/opt/ml/processing/output/",
+                    "S3UploadMode": "EndOfJob",
+                },
+                "AppManaged": False,
+            }
+        ]
+    },
+    "ProcessingResources": {
+        "ClusterConfig": {
+            "InstanceCount": 1,
+            "InstanceType": "ml.m5.large",
+            "VolumeSizeInGB": 5,
+        }
+    },
+    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
+    "AppSpecification": {
+        "ImageUri": f"{image_uri}",
+        "ContainerEntrypoint": ["python3", "./preprocessing.py"],
+    },
+    "RoleArn": f"{role}",
+}
+
+sagemaker_training_job_config = {
+    "AlgorithmSpecification": {
+        "TrainingImage": f"{image_uri}",
+        "TrainingInputMode": "File",
+    },
+    "InputDataConfig": [
+        {
+            "ChannelName": "config",
+            "DataSource": {
+                "S3DataSource": {
+                    "S3DataType": "S3Prefix",
+                    "S3Uri": f"s3://{s3_bucket}/config/",
+                    "S3DataDistributionType": "FullyReplicated",
+                }
+            },
+            "CompressionType": "None",
+            "RecordWrapperType": "None",
+        },
+    ],
+    "OutputDataConfig": {
+        "KmsKeyId": "",
+        "S3OutputPath": f"s3://{s3_bucket}/training/",
+    },
+    "ResourceConfig": {
+        "InstanceType": "ml.m5.large",
+        "InstanceCount": 1,
+        "VolumeSizeInGB": 5,
+    },
+    "StoppingCondition": {"MaxRuntimeInSeconds": 6000},
+    "RoleArn": f"{role}",
+    "EnableNetworkIsolation": False,
+    "EnableInterContainerTrafficEncryption": False,
+    "EnableManagedSpotTraining": False,
+    "TrainingJobName": training_job_name,
+}
+
+sagemaker_create_model_config = {
+    "ModelName": model_name,
+    "Containers": [
+        {
+            "Image": f"{image_uri}",
+            "Mode": "SingleModel",
+            "ModelDataUrl": f"s3://{s3_bucket}/training/{training_job_name}/output/model.tar.gz",
+        }
+    ],
+    "ExecutionRoleArn": f"{role}",
+    "EnableNetworkIsolation": False,
+}
+
+sagemaker_inference_config = {
+    "TransformJobName": "sample_transform_job",
+    "ModelName": model_name,
+    "TransformInput": {
+        "DataSource": {
+            "S3DataSource": {
+                "S3DataType": "S3Prefix",
+                "S3Uri": f"s3://{s3_bucket}/config/config_date.yml",
+            }
+        },
+        "ContentType": "application/x-yaml",
+        "CompressionType": "None",
+        "SplitType": "None",
+    },
+    "TransformOutput": {"S3OutputPath": f"s3://{s3_bucket}/inferencing/output/"},
+    "TransformResources": {"InstanceType": "ml.m5.large", "InstanceCount": 1},
+}
+
+# [START howto_operator_sagemaker]
+with DAG(
+    "sample_sagemaker_dag",
+    schedule_interval=None,
+    start_date=datetime(2022, 2, 21),
+    catchup=False,
+) as dag:
+    sagemaker_processing_task = SageMakerProcessingOperator(
+        config=sagemaker_processing_job_config,
+        aws_conn_id="aws_default",
+        task_id="sagemaker_preprocessing_task",
+    )
+
+    training_task = SageMakerTrainingOperator(
+        config=sagemaker_training_job_config, aws_conn_id="aws_default", task_id="sagemaker_training_task"
+    )
+
+    model_delete_task = SageMakerDeleteModelOperator(
+        task_id="sagemaker_delete_model_task",
+        model_name=model_name,
+        aws_conn_id="aws_default",
+        dag=dag,

Review comment:
       I have made the changes.







[GitHub] [airflow] hsrocks commented on a change in pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
hsrocks commented on a change in pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#discussion_r814040246



##########
File path: airflow/providers/amazon/aws/example_dags/example_sagemaker.py
##########
@@ -0,0 +1,179 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from datetime import datetime
+from os import environ
+
+from airflow import DAG
+from airflow.providers.amazon.aws.operators.sagemaker import (
+    SageMakerDeleteModelOperator,
+    SageMakerModelOperator,
+    SageMakerProcessingOperator,
+    SageMakerTrainingOperator,
+    SageMakerTransformOperator,
+)
+
+model_name = "sample_model"
+training_job_name = 'sample_training'

Review comment:
       I have corrected the quotes and the global variables. Thanks!







[GitHub] [airflow] ferruzzi commented on a change in pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#discussion_r814338452



##########
File path: tests/providers/amazon/aws/operators/test_sagemaker_model.py
##########
@@ -63,3 +66,17 @@ def test_execute_with_failure(self, mock_model, mock_client):
         mock_model.return_value = {'ModelArn': 'testarn', 'ResponseMetadata': {'HTTPStatusCode': 404}}
         with pytest.raises(AirflowException):
             self.sagemaker.execute(None)
+
+
+class TestSageMakerDeleteModelOperator(unittest.TestCase):
+    def setUp(self):
+        self.sagemaker = SageMakerDeleteModelOperator(
+            task_id='test_sagemaker_operator', aws_conn_id='sagemaker_test_id', model_name='test'
+        )
+
+    @mock.patch.object(SageMakerHook, 'get_conn')
+    @mock.patch.object(SageMakerHook, 'delete_model')
+    def test_model_delete(self, mock_model, mock_client):

Review comment:
       Thank you.







[GitHub] [airflow] eladkal commented on pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
eladkal commented on pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#issuecomment-1046056598


   Yep, probably not related. Once you add the docs and example, I'll review.





[GitHub] [airflow] hsrocks commented on pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
hsrocks commented on pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#issuecomment-1045984330


   > Can you please add docs and example dag? https://github.com/apache/airflow/tree/main/airflow/providers/amazon/aws/example_dags https://github.com/apache/airflow/tree/main/docs/apache-airflow-providers-amazon/operators
   
   Sure! Valid point, @eladkal. I can see they are missing for SageMaker altogether; I will add them. The failed test
   'Tests / MySQL5.7, Py3.7: Always Integration Providers (pull_request)' is not related to my changes. Can you please advise, @eladkal?





[GitHub] [airflow] ferruzzi edited a comment on pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
ferruzzi edited a comment on pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#issuecomment-1051112684


   ![image](https://user-images.githubusercontent.com/1920178/155771942-e64c26d0-09ad-4a39-ab61-e12563be3d8c.png)
   @hsrocks The failing test doesn't appear to be related.  The file wasn't touched by this PR and the line that is failing hasn't been touched in months, but @ashb was in that file yesterday and may have more info.
   





[GitHub] [airflow] hsrocks commented on a change in pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
hsrocks commented on a change in pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#discussion_r814402742



##########
File path: airflow/providers/amazon/aws/example_dags/example_sagemaker.py
##########
@@ -0,0 +1,179 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from datetime import datetime
+from os import environ
+
+from airflow import DAG
+from airflow.providers.amazon.aws.operators.sagemaker import (
+    SageMakerDeleteModelOperator,
+    SageMakerModelOperator,
+    SageMakerProcessingOperator,
+    SageMakerTrainingOperator,
+    SageMakerTransformOperator,
+)
+
+model_name = "sample_model"
+training_job_name = 'sample_training'
+image_uri = environ.get('ECR_IMAGE_URI', '123456789012.dkr.ecr.us-east-1.amazonaws.com/repo_name')
+s3_bucket = environ.get('BUCKET_NAME', 'test-airflow-12345')
+role = environ.get('SAGEMAKER_ROLE_ARN', 'arn:aws:iam::123456789012:role/role_name')
+
+sagemaker_processing_job_config = {
+    "ProcessingJobName": "sample_processing_job",
+    "ProcessingInputs": [
+        {
+            "InputName": "input",
+            "AppManaged": False,
+            "S3Input": {
+                "S3Uri": f"s3://{s3_bucket}/preprocessing/input/",
+                "LocalPath": "/opt/ml/processing/input/",
+                "S3DataType": "S3Prefix",
+                "S3InputMode": "File",
+                "S3DataDistributionType": "FullyReplicated",
+                "S3CompressionType": "None",
+            },
+        },
+    ],
+    "ProcessingOutputConfig": {
+        "Outputs": [
+            {
+                "OutputName": "output",
+                "S3Output": {
+                    "S3Uri": f"s3://{s3_bucket}/preprocessing/output/",
+                    "LocalPath": "/opt/ml/processing/output/",
+                    "S3UploadMode": "EndOfJob",
+                },
+                "AppManaged": False,
+            }
+        ]
+    },
+    "ProcessingResources": {
+        "ClusterConfig": {
+            "InstanceCount": 1,
+            "InstanceType": "ml.m5.large",
+            "VolumeSizeInGB": 5,
+        }
+    },
+    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
+    "AppSpecification": {
+        "ImageUri": f"{image_uri}",
+        "ContainerEntrypoint": ["python3", "./preprocessing.py"],
+    },
+    "RoleArn": f"{role}",
+}
+
+sagemaker_training_job_config = {
+    "AlgorithmSpecification": {
+        "TrainingImage": f"{image_uri}",
+        "TrainingInputMode": "File",
+    },
+    "InputDataConfig": [
+        {
+            "ChannelName": "config",
+            "DataSource": {
+                "S3DataSource": {
+                    "S3DataType": "S3Prefix",
+                    "S3Uri": f"s3://{s3_bucket}/config/",
+                    "S3DataDistributionType": "FullyReplicated",
+                }
+            },
+            "CompressionType": "None",
+            "RecordWrapperType": "None",
+        },
+    ],
+    "OutputDataConfig": {
+        "KmsKeyId": "",
+        "S3OutputPath": f"s3://{s3_bucket}/training/",
+    },
+    "ResourceConfig": {
+        "InstanceType": "ml.m5.large",
+        "InstanceCount": 1,
+        "VolumeSizeInGB": 5,
+    },
+    "StoppingCondition": {"MaxRuntimeInSeconds": 6000},
+    "RoleArn": f"{role}",
+    "EnableNetworkIsolation": False,
+    "EnableInterContainerTrafficEncryption": False,
+    "EnableManagedSpotTraining": False,
+    "TrainingJobName": training_job_name,
+}
+
+sagemaker_create_model_config = {
+    "ModelName": model_name,
+    "Containers": [
+        {
+            "Image": f"{image_uri}",
+            "Mode": "SingleModel",
+            "ModelDataUrl": f"s3://{s3_bucket}/training/{training_job_name}/output/model.tar.gz",
+        }
+    ],
+    "ExecutionRoleArn": f"{role}",
+    "EnableNetworkIsolation": False,
+}
+
+sagemaker_inference_config = {
+    "TransformJobName": "sample_transform_job",
+    "ModelName": model_name,
+    "TransformInput": {
+        "DataSource": {
+            "S3DataSource": {
+                "S3DataType": "S3Prefix",
+                "S3Uri": f"s3://{s3_bucket}/config/config_date.yml",
+            }
+        },
+        "ContentType": "application/x-yaml",
+        "CompressionType": "None",
+        "SplitType": "None",
+    },
+    "TransformOutput": {"S3OutputPath": f"s3://{s3_bucket}/inferencing/output/"},
+    "TransformResources": {"InstanceType": "ml.m5.large", "InstanceCount": 1},
+}
+
+# [START howto_operator_sagemaker]
+with DAG(
+    "sample_sagemaker_dag",
+    schedule_interval=None,
+    start_date=datetime(2022, 2, 21),
+    catchup=False,
+) as dag:
+    sagemaker_processing_task = SageMakerProcessingOperator(
+        config=sagemaker_processing_job_config,
+        aws_conn_id="aws_default",

Review comment:
       @ferruzzi Shall we open an issue for this? I might be able to take it up, but I believe it will require changes in most operators, not only the Amazon ones. Right?







[GitHub] [airflow] hsrocks commented on pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
hsrocks commented on pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#issuecomment-1050419819


   > Looks like my points have been addressed other than that new question about the indenting of the START/END flag in the sample DAG, which may or may not be a concern. Thanks for the changes.
   
   Thanks a lot for the review :)





[GitHub] [airflow] hsrocks commented on a change in pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
hsrocks commented on a change in pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#discussion_r814037163



##########
File path: airflow/providers/amazon/aws/operators/sagemaker.py
##########
@@ -608,3 +608,26 @@ def _check_if_job_exists(self) -> None:
                 raise AirflowException(
                     f'A SageMaker training job with name {training_job_name} already exists.'
                 )
+
+
+class SageMakerDeleteModelOperator(BaseOperator):
+    """Deletes a SageMaker model.
+
+    This operator returns True if model was present and deleted else return False if Model was not present.
+
+    :param model_name: The name of Sagemaker Model (templated).

Review comment:
       Corrected it. Thanks!







[GitHub] [airflow] ashb commented on pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#issuecomment-1051125649


   If in doubt, rebase. In this case it will help, as it has been fixed on main.





[GitHub] [airflow] ferruzzi commented on a change in pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#discussion_r814339781



##########
File path: airflow/providers/amazon/aws/example_dags/example_sagemaker.py
##########
@@ -0,0 +1,179 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from datetime import datetime
+from os import environ
+
+from airflow import DAG
+from airflow.providers.amazon.aws.operators.sagemaker import (
+    SageMakerDeleteModelOperator,
+    SageMakerModelOperator,
+    SageMakerProcessingOperator,
+    SageMakerTrainingOperator,
+    SageMakerTransformOperator,
+)
+
+model_name = "sample_model"
+training_job_name = 'sample_training'

Review comment:
       I appreciate that, thanks.







[GitHub] [airflow] hsrocks commented on a change in pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
hsrocks commented on a change in pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#discussion_r815426214



##########
File path: docs/apache-airflow-providers-amazon/operators/sagemaker.rst
##########
@@ -0,0 +1,72 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+Amazon SageMaker Operators
+========================================
+
+Prerequisite Tasks
+------------------
+
+.. include:: _partials/prerequisite_tasks.rst
+
+Overview
+--------
+
+Airflow to Amazon SageMaker integration provides several operators to create and interact with
+SageMaker Jobs.
+
+  - :class:`~airflow.providers.amazon.aws.operators.sagemaker.SageMakerDeleteModelOperator`
+  - :class:`~airflow.providers.amazon.aws.operators.sagemaker.SageMakerModelOperator`
+  - :class:`~airflow.providers.amazon.aws.operators.sagemaker.SageMakerProcessingOperator`
+  - :class:`~airflow.providers.amazon.aws.operators.sagemaker.SageMakerTrainingOperator`
+  - :class:`~airflow.providers.amazon.aws.operators.sagemaker.SageMakerTransformOperator`
+  - :class:`~airflow.providers.amazon.aws.operators.sagemaker.SageMakerTuningOperator`
+
+One example_dag is provided which showcases some of these operators in action.
+
+ - example_sagemaker.py
+
+--------------------------------------------------------------

Review comment:
       @eladkal The changes for this are pushed. Please review. Thanks!







[GitHub] [airflow] eladkal commented on a change in pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
eladkal commented on a change in pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#discussion_r813322129



##########
File path: airflow/providers/amazon/aws/operators/sagemaker.py
##########
@@ -608,3 +608,26 @@ def _check_if_job_exists(self) -> None:
                 raise AirflowException(
                     f'A SageMaker training job with name {training_job_name} already exists.'
                 )
+
+
+class SageMakerDeleteModelOperator(BaseOperator):

Review comment:
       Why does this operator inherit from `BaseOperator` and not from `SageMakerBaseOperator`?

##########
File path: airflow/providers/amazon/aws/operators/sagemaker.py
##########
@@ -608,3 +608,26 @@ def _check_if_job_exists(self) -> None:
                 raise AirflowException(
                     f'A SageMaker training job with name {training_job_name} already exists.'
                 )
+
+
+class SageMakerDeleteModelOperator(BaseOperator):
+    """Deletes a SageMaker model.
+
+    This operator returns True if model was present and deleted else return False if Model was not present.
+
+    :param model_name: The name of Sagemaker Model (templated).

Review comment:
       You mentioned it's templated, but you didn't specify the templated fields.
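
For context, here is a minimal sketch of what declaring templated fields means. It uses only the standard library and is not Airflow's implementation: `FakeOperator`, `DeleteModelSketch`, `render_template_fields` as written here, and the small `{{ name }}` regex renderer are all hypothetical stand-ins (Airflow itself renders with Jinja). The point is only that an attribute is rendered before execution when its name is listed in `template_fields`.

```python
import re


class FakeOperator:
    """Hypothetical stand-in for an operator base class (not Airflow's BaseOperator)."""

    # Names of attributes that should be rendered before the task runs.
    template_fields = ()

    def render_template_fields(self, context):
        # Replace each "{{ key }}" placeholder with the matching context value.
        for name in self.template_fields:
            raw = getattr(self, name)
            rendered = re.sub(
                r"\{\{\s*(\w+)\s*\}\}",
                lambda match: str(context[match.group(1)]),
                raw,
            )
            setattr(self, name, rendered)


class DeleteModelSketch(FakeOperator):
    # Without this declaration, model_name would be left unrendered.
    template_fields = ("model_name",)

    def __init__(self, model_name):
        self.model_name = model_name


op = DeleteModelSketch(model_name="model-{{ ds }}")
op.render_template_fields({"ds": "2022-02-21"})
print(op.model_name)
```

If `template_fields` were left empty in this sketch, `op.model_name` would still contain the literal `{{ ds }}` placeholder, which is the gap the comment above points at.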

##########
File path: airflow/providers/amazon/aws/example_dags/example_sagemaker.py
##########
@@ -0,0 +1,179 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from datetime import datetime
+from os import environ
+
+from airflow import DAG
+from airflow.providers.amazon.aws.operators.sagemaker import (
+    SageMakerDeleteModelOperator,
+    SageMakerModelOperator,
+    SageMakerProcessingOperator,
+    SageMakerTrainingOperator,
+    SageMakerTransformOperator,
+)
+
+model_name = "sample_model"
+training_job_name = 'sample_training'
+image_uri = environ.get('ECR_IMAGE_URI', '123456789012.dkr.ecr.us-east-1.amazonaws.com/repo_name')
+s3_bucket = environ.get('BUCKET_NAME', 'test-airflow-12345')
+role = environ.get('SAGEMAKER_ROLE_ARN', 'arn:aws:iam::123456789012:role/role_name')
+
+sagemaker_processing_job_config = {
+    "ProcessingJobName": "sample_processing_job",
+    "ProcessingInputs": [
+        {
+            "InputName": "input",
+            "AppManaged": False,
+            "S3Input": {
+                "S3Uri": f"s3://{s3_bucket}/preprocessing/input/",
+                "LocalPath": "/opt/ml/processing/input/",
+                "S3DataType": "S3Prefix",
+                "S3InputMode": "File",
+                "S3DataDistributionType": "FullyReplicated",
+                "S3CompressionType": "None",
+            },
+        },
+    ],
+    "ProcessingOutputConfig": {
+        "Outputs": [
+            {
+                "OutputName": "output",
+                "S3Output": {
+                    "S3Uri": f"s3://{s3_bucket}/preprocessing/output/",
+                    "LocalPath": "/opt/ml/processing/output/",
+                    "S3UploadMode": "EndOfJob",
+                },
+                "AppManaged": False,
+            }
+        ]
+    },
+    "ProcessingResources": {
+        "ClusterConfig": {
+            "InstanceCount": 1,
+            "InstanceType": "ml.m5.large",
+            "VolumeSizeInGB": 5,
+        }
+    },
+    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
+    "AppSpecification": {
+        "ImageUri": f"{image_uri}",
+        "ContainerEntrypoint": ["python3", "./preprocessing.py"],
+    },
+    "RoleArn": f"{role}",
+}
+
+sagemaker_training_job_config = {
+    "AlgorithmSpecification": {
+        "TrainingImage": f"{image_uri}",
+        "TrainingInputMode": "File",
+    },
+    "InputDataConfig": [
+        {
+            "ChannelName": "config",
+            "DataSource": {
+                "S3DataSource": {
+                    "S3DataType": "S3Prefix",
+                    "S3Uri": f"s3://{s3_bucket}/config/",
+                    "S3DataDistributionType": "FullyReplicated",
+                }
+            },
+            "CompressionType": "None",
+            "RecordWrapperType": "None",
+        },
+    ],
+    "OutputDataConfig": {
+        "KmsKeyId": "",
+        "S3OutputPath": f"s3://{s3_bucket}/training/",
+    },
+    "ResourceConfig": {
+        "InstanceType": "ml.m5.large",
+        "InstanceCount": 1,
+        "VolumeSizeInGB": 5,
+    },
+    "StoppingCondition": {"MaxRuntimeInSeconds": 6000},
+    "RoleArn": f"{role}",
+    "EnableNetworkIsolation": False,
+    "EnableInterContainerTrafficEncryption": False,
+    "EnableManagedSpotTraining": False,
+    "TrainingJobName": training_job_name,
+}
+
+sagemaker_create_model_config = {
+    "ModelName": model_name,
+    "Containers": [
+        {
+            "Image": f"{image_uri}",
+            "Mode": "SingleModel",
+            "ModelDataUrl": f"s3://{s3_bucket}/training/{training_job_name}/output/model.tar.gz",
+        }
+    ],
+    "ExecutionRoleArn": f"{role}",
+    "EnableNetworkIsolation": False,
+}
+
+sagemaker_inference_config = {
+    "TransformJobName": "sample_transform_job",
+    "ModelName": model_name,
+    "TransformInput": {
+        "DataSource": {
+            "S3DataSource": {
+                "S3DataType": "S3Prefix",
+                "S3Uri": f"s3://{s3_bucket}/config/config_date.yml",
+            }
+        },
+        "ContentType": "application/x-yaml",
+        "CompressionType": "None",
+        "SplitType": "None",
+    },
+    "TransformOutput": {"S3OutputPath": f"s3://{s3_bucket}/inferencing/output/"},
+    "TransformResources": {"InstanceType": "ml.m5.large", "InstanceCount": 1},
+}
+
+# [START howto_operator_sagemaker]
+with DAG(
+    "sample_sagemaker_dag",
+    schedule_interval=None,
+    start_date=datetime(2022, 2, 21),
+    catchup=False,
+) as dag:
+    sagemaker_processing_task = SageMakerProcessingOperator(
+        config=sagemaker_processing_job_config,
+        aws_conn_id="aws_default",
+        task_id="sagemaker_preprocessing_task",
+    )
+
+    training_task = SageMakerTrainingOperator(
+        config=sagemaker_training_job_config, aws_conn_id="aws_default", task_id="sagemaker_training_task"
+    )
+
+    model_delete_task = SageMakerDeleteModelOperator(
+        task_id="sagemaker_delete_model_task",
+        model_name=model_name,
+        aws_conn_id="aws_default",
+        dag=dag,

Review comment:
       Redundant: the task is already created inside the `with DAG(...)` block.
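
A minimal sketch of why passing `dag=dag` adds nothing inside a `with` block. `SketchDAG` and `SketchTask` are hypothetical stand-ins, not Airflow classes; they only mimic the behaviour that tasks created inside the context manager attach to the active DAG on their own.

```python
class SketchDAG:
    """Hypothetical stand-in for a DAG that tracks the active `with` block."""

    _active = None  # the DAG whose `with` block is currently open, if any

    def __init__(self, dag_id):
        self.dag_id = dag_id
        self.tasks = []

    def __enter__(self):
        SketchDAG._active = self
        return self

    def __exit__(self, exc_type, exc, tb):
        SketchDAG._active = None
        return False


class SketchTask:
    """Hypothetical stand-in for an operator."""

    def __init__(self, task_id, dag=None):
        self.task_id = task_id
        # Fall back to the enclosing `with` block when no dag is passed.
        self.dag = dag or SketchDAG._active
        if self.dag is not None:
            self.dag.tasks.append(self)


with SketchDAG("sample_sagemaker_dag") as dag:
    implicit = SketchTask("delete_model_implicit")
    explicit = SketchTask("delete_model_explicit", dag=dag)

print(implicit.dag is explicit.dag)
```

Both tasks end up on the same DAG object, so the explicit argument can be dropped without changing the result.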

##########
File path: airflow/providers/amazon/aws/example_dags/example_sagemaker.py
##########
@@ -0,0 +1,179 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from datetime import datetime
+from os import environ
+
+from airflow import DAG
+from airflow.providers.amazon.aws.operators.sagemaker import (
+    SageMakerDeleteModelOperator,
+    SageMakerModelOperator,
+    SageMakerProcessingOperator,
+    SageMakerTrainingOperator,
+    SageMakerTransformOperator,
+)
+
+model_name = "sample_model"
+training_job_name = 'sample_training'
+image_uri = environ.get('ECR_IMAGE_URI', '123456789012.dkr.ecr.us-east-1.amazonaws.com/repo_name')
+s3_bucket = environ.get('BUCKET_NAME', 'test-airflow-12345')
+role = environ.get('SAGEMAKER_ROLE_ARN', 'arn:aws:iam::123456789012:role/role_name')
+
+sagemaker_processing_job_config = {
+    "ProcessingJobName": "sample_processing_job",
+    "ProcessingInputs": [
+        {
+            "InputName": "input",
+            "AppManaged": False,
+            "S3Input": {
+                "S3Uri": f"s3://{s3_bucket}/preprocessing/input/",
+                "LocalPath": "/opt/ml/processing/input/",
+                "S3DataType": "S3Prefix",
+                "S3InputMode": "File",
+                "S3DataDistributionType": "FullyReplicated",
+                "S3CompressionType": "None",
+            },
+        },
+    ],
+    "ProcessingOutputConfig": {
+        "Outputs": [
+            {
+                "OutputName": "output",
+                "S3Output": {
+                    "S3Uri": f"s3://{s3_bucket}/preprocessing/output/",
+                    "LocalPath": "/opt/ml/processing/output/",
+                    "S3UploadMode": "EndOfJob",
+                },
+                "AppManaged": False,
+            }
+        ]
+    },
+    "ProcessingResources": {
+        "ClusterConfig": {
+            "InstanceCount": 1,
+            "InstanceType": "ml.m5.large",
+            "VolumeSizeInGB": 5,
+        }
+    },
+    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
+    "AppSpecification": {
+        "ImageUri": f"{image_uri}",
+        "ContainerEntrypoint": ["python3", "./preprocessing.py"],
+    },
+    "RoleArn": f"{role}",
+}
+
+sagemaker_training_job_config = {
+    "AlgorithmSpecification": {
+        "TrainingImage": f"{image_uri}",
+        "TrainingInputMode": "File",
+    },
+    "InputDataConfig": [
+        {
+            "ChannelName": "config",
+            "DataSource": {
+                "S3DataSource": {
+                    "S3DataType": "S3Prefix",
+                    "S3Uri": f"s3://{s3_bucket}/config/",
+                    "S3DataDistributionType": "FullyReplicated",
+                }
+            },
+            "CompressionType": "None",
+            "RecordWrapperType": "None",
+        },
+    ],
+    "OutputDataConfig": {
+        "KmsKeyId": "",
+        "S3OutputPath": f"s3://{s3_bucket}/training/",
+    },
+    "ResourceConfig": {
+        "InstanceType": "ml.m5.large",
+        "InstanceCount": 1,
+        "VolumeSizeInGB": 5,
+    },
+    "StoppingCondition": {"MaxRuntimeInSeconds": 6000},
+    "RoleArn": f"{role}",
+    "EnableNetworkIsolation": False,
+    "EnableInterContainerTrafficEncryption": False,
+    "EnableManagedSpotTraining": False,
+    "TrainingJobName": training_job_name,
+}
+
+sagemaker_create_model_config = {
+    "ModelName": model_name,
+    "Containers": [
+        {
+            "Image": f"{image_uri}",
+            "Mode": "SingleModel",
+            "ModelDataUrl": f"s3://{s3_bucket}/training/{training_job_name}/output/model.tar.gz",
+        }
+    ],
+    "ExecutionRoleArn": f"{role}",
+    "EnableNetworkIsolation": False,
+}
+
+sagemaker_inference_config = {
+    "TransformJobName": "sample_transform_job",
+    "ModelName": model_name,
+    "TransformInput": {
+        "DataSource": {
+            "S3DataSource": {
+                "S3DataType": "S3Prefix",
+                "S3Uri": f"s3://{s3_bucket}/config/config_date.yml",
+            }
+        },
+        "ContentType": "application/x-yaml",
+        "CompressionType": "None",
+        "SplitType": "None",
+    },
+    "TransformOutput": {"S3OutputPath": f"s3://{s3_bucket}/inferencing/output/"},
+    "TransformResources": {"InstanceType": "ml.m5.large", "InstanceCount": 1},
+}
+
+# [START howto_operator_sagemaker]
+with DAG(
+    "sample_sagemaker_dag",
+    schedule_interval=None,
+    start_date=datetime(2022, 2, 21),
+    catchup=False,
+) as dag:
+    sagemaker_processing_task = SageMakerProcessingOperator(
+        config=sagemaker_processing_job_config,
+        aws_conn_id="aws_default",
+        task_id="sagemaker_preprocessing_task",
+    )
+
+    training_task = SageMakerTrainingOperator(
+        config=sagemaker_training_job_config, aws_conn_id="aws_default", task_id="sagemaker_training_task"
+    )
+
+    model_delete_task = SageMakerDeleteModelOperator(
+        task_id="sagemaker_delete_model_task",
+        model_name=model_name,

Review comment:
       I don't follow this.
   Did `sample_model` exist before this DAG started?
   
   The goal of example DAGs is that users should be able to run them and have them work.
   Is this example really runnable?
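
To illustrate the concern, here is a small sketch that runs against an in-memory stand-in rather than AWS. `FakeModelStore` and `run_in_order` are hypothetical names invented for this illustration; the `True`/`False` return of `delete_model` mirrors the behaviour described in the operator docstring under review, not a confirmed API.

```python
class FakeModelStore:
    """Hypothetical in-memory stand-in for the model registry (not a boto3 client)."""

    def __init__(self):
        self._models = set()

    def create_model(self, model_name):
        self._models.add(model_name)

    def delete_model(self, model_name):
        # Report whether there was anything to delete.
        if model_name not in self._models:
            return False
        self._models.remove(model_name)
        return True


def run_in_order(store, steps):
    """Run (action, model_name) steps sequentially and collect the results."""
    return [getattr(store, action)(name) for action, name in steps]


# Delete with no prior create: nothing exists yet on a fresh account.
delete_only = run_in_order(FakeModelStore(), [("delete_model", "sample_model")])

# Create first, then delete: the example cleans up what it made itself.
create_then_delete = run_in_order(
    FakeModelStore(),
    [("create_model", "sample_model"), ("delete_model", "sample_model")],
)

print(delete_only, create_then_delete)
```

In this sketch the delete step only has something to act on when a create step ran before it, which is the ordering an example DAG would need if it is meant to run without pre-existing resources.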







[GitHub] [airflow] hsrocks commented on pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
hsrocks commented on pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#issuecomment-1046727186


   > > Can you please add docs and example dag? https://github.com/apache/airflow/tree/main/airflow/providers/amazon/aws/example_dags https://github.com/apache/airflow/tree/main/docs/apache-airflow-providers-amazon/operators
   > 
   > Sure! Valid point, @eladkal. I can see they are missing for SageMaker altogether; I will add them. The failed test 'Tests / MySQL5.7, Py3.7: Always Integration Providers (pull_request)' is not related to my changes. Can you please advise, @eladkal?
   
   @eladkal Can you please check the code and share your feedback?





[GitHub] [airflow] ferruzzi commented on a change in pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#discussion_r813427675



##########
File path: tests/providers/amazon/aws/operators/test_sagemaker_model.py
##########
@@ -63,3 +66,17 @@ def test_execute_with_failure(self, mock_model, mock_client):
         mock_model.return_value = {'ModelArn': 'testarn', 'ResponseMetadata': {'HTTPStatusCode': 404}}
         with pytest.raises(AirflowException):
             self.sagemaker.execute(None)
+
+
+class TestSageMakerDeleteModelOperator(unittest.TestCase):
+    def setUp(self):
+        self.sagemaker = SageMakerDeleteModelOperator(
+            task_id='test_sagemaker_operator', aws_conn_id='sagemaker_test_id', model_name='test'
+        )
+
+    @mock.patch.object(SageMakerHook, 'get_conn')
+    @mock.patch.object(SageMakerHook, 'delete_model')
+    def test_model_delete(self, mock_model, mock_client):

Review comment:
       Nitpick, take it or leave it: I'd rename `mock_model` to `mock_delete_model` for clarity, though I suppose that's personal preference.
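
A short sketch of why the name matters. With stacked `mock.patch.object` decorators, the decorator closest to the function supplies the first mock argument, so a descriptive name makes it easier to see which mock is which. `HookSketch` and `check_delete` are hypothetical stand-ins for the real hook and test.

```python
from unittest import mock


class HookSketch:
    """Hypothetical stand-in for the hook under test."""

    def get_conn(self):
        raise RuntimeError("would open a real connection")

    def delete_model(self, model_name):
        raise RuntimeError("would call the real service")


# Decorators apply bottom-up, so the patch closest to the function
# supplies the first mock argument.
@mock.patch.object(HookSketch, "get_conn")
@mock.patch.object(HookSketch, "delete_model")
def check_delete(mock_delete_model, mock_get_conn):
    HookSketch().delete_model(model_name="test")
    mock_delete_model.assert_called_once_with(model_name="test")
    mock_get_conn.assert_not_called()
    return True


print(check_delete())
```

Naming the first parameter after the patched method keeps the assertions readable when more patches are added later.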







[GitHub] [airflow] hsrocks commented on a change in pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
hsrocks commented on a change in pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#discussion_r814029742



##########
File path: airflow/providers/amazon/aws/operators/sagemaker.py
##########
@@ -608,3 +608,26 @@ def _check_if_job_exists(self) -> None:
                 raise AirflowException(
                     f'A SageMaker training job with name {training_job_name} already exists.'
                 )
+
+
+class SageMakerDeleteModelOperator(BaseOperator):

Review comment:
       I've made the changes. I originally subclassed BaseOperator because SageMakerBaseOperator requires a config dictionary, whereas the delete-model call only needs ModelName as an argument. To stay consistent with the other SageMaker operators, I have now made the necessary changes. Please check and resolve. Thanks.
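
The trade-off described above can be sketched roughly as follows. All class names here (`FakeBaseOperator`, `DeleteModelWithConfig`, `DeleteModelWithName`, `RecordingHook`) are hypothetical stand-ins and not the code in this PR; the sketch only assumes that a config-based operator reads `ModelName` from its config dictionary.

```python
class FakeBaseOperator:
    """Hypothetical stand-in for a SageMaker base operator that takes a config dict."""

    def __init__(self, config):
        self.config = config


class DeleteModelWithConfig(FakeBaseOperator):
    """Config-based variant: the model name travels inside the config dict."""

    def execute(self, hook):
        hook.delete_model(model_name=self.config['ModelName'])


class DeleteModelWithName:
    """Standalone variant: the model name is passed directly."""

    def __init__(self, model_name):
        self.model_name = model_name

    def execute(self, hook):
        hook.delete_model(model_name=self.model_name)


class RecordingHook:
    """Hypothetical hook that records which models it was asked to delete."""

    def __init__(self):
        self.deleted = []

    def delete_model(self, model_name):
        self.deleted.append(model_name)
```

Both variants end up making the same hook call; the config-based one carries a one-key dictionary in exchange for matching the constructor shape of the sibling operators.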







[GitHub] [airflow] hsrocks commented on a change in pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
hsrocks commented on a change in pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#discussion_r814041892



##########
File path: airflow/providers/amazon/aws/hooks/sagemaker.py
##########
@@ -908,3 +908,19 @@ def find_processing_job_by_name(self, processing_job_name: str) -> bool:
             if e.response['Error']['Code'] in ['ValidationException', 'ResourceNotFound']:
                 return False
             raise
+
+    def delete_model(self, model_name: str):
+        """Delete Sagemaker model
+        :param model_name: (optional) name of the model
+        :return: True if Model exists and deleted else return False

Review comment:
       Yes, my intent was that the delete call should not fail when the model is not present. My initial thinking was that on the first run of a workflow the SageMaker models have not been created yet, so the workflow would always fail. That is why I thought of returning False on a ValidationException when the model is not present. But I take your point about keeping the behavior as close to boto3 as possible so that it is easier to understand. I've made the changes based on that. Thanks a lot.
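
The two behaviors discussed here can be compared in a small sketch. `FakeClientError` and `FakeClient` are hypothetical stand-ins for the botocore error type and the boto3 SageMaker client, and the two functions are illustrations rather than the hook code from this PR.

```python
class FakeClientError(Exception):
    """Hypothetical stand-in for botocore.exceptions.ClientError."""

    def __init__(self, code):
        super().__init__(code)
        self.response = {'Error': {'Code': code}}


class FakeClient:
    """Hypothetical stand-in for the boto3 SageMaker client."""

    def __init__(self, models):
        self.models = set(models)

    def delete_model(self, ModelName):
        if ModelName not in self.models:
            raise FakeClientError('ValidationException')
        self.models.remove(ModelName)


def delete_model_lenient(client, model_name):
    """Initial idea: report a missing model as False instead of failing."""
    try:
        client.delete_model(ModelName=model_name)
        return True
    except FakeClientError as e:
        if e.response['Error']['Code'] == 'ValidationException':
            return False
        raise


def delete_model_strict(client, model_name):
    """Behavior closer to the client: let its error propagate unchanged."""
    client.delete_model(ModelName=model_name)
```

The lenient variant hides a missing model behind a return value that callers can easily ignore, while the strict variant fails the same way a direct client call would, which is the behavior the review asked for.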







[GitHub] [airflow] github-actions[bot] commented on pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#issuecomment-1053531173


   The PR is likely OK to be merged with just subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.





[GitHub] [airflow] eladkal merged pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
eladkal merged pull request #21673:
URL: https://github.com/apache/airflow/pull/21673


   





[GitHub] [airflow] hsrocks commented on a change in pull request #21673: Implement a Sagemaker DeleteModelOperator and Delete model hook.

Posted by GitBox <gi...@apache.org>.
hsrocks commented on a change in pull request #21673:
URL: https://github.com/apache/airflow/pull/21673#discussion_r815426214



##########
File path: docs/apache-airflow-providers-amazon/operators/sagemaker.rst
##########
@@ -0,0 +1,72 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+Amazon SageMaker Operators
+========================================
+
+Prerequisite Tasks
+------------------
+
+.. include:: _partials/prerequisite_tasks.rst
+
+Overview
+--------
+
+Airflow to Amazon SageMaker integration provides several operators to create and interact with
+SageMaker Jobs.
+
+  - :class:`~airflow.providers.amazon.aws.operators.sagemaker.SageMakerDeleteModelOperator`
+  - :class:`~airflow.providers.amazon.aws.operators.sagemaker.SageMakerModelOperator`
+  - :class:`~airflow.providers.amazon.aws.operators.sagemaker.SageMakerProcessingOperator`
+  - :class:`~airflow.providers.amazon.aws.operators.sagemaker.SageMakerTrainingOperator`
+  - :class:`~airflow.providers.amazon.aws.operators.sagemaker.SageMakerTransformOperator`
+  - :class:`~airflow.providers.amazon.aws.operators.sagemaker.SageMakerTuningOperator`
+
+One example_dag is provided which showcases some of these operators in action.
+
+ - example_sagemaker.py
+
+--------------------------------------------------------------

Review comment:
       @eladkal, the changes for this have been pushed. Please review. Thanks!



