You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/06/07 15:03:50 UTC

[GitHub] [airflow] Taragolis opened a new pull request, #24294: Refactoring EmrClusterLink and add for other AWS EMR Operators

Taragolis opened a new pull request, #24294:
URL: https://github.com/apache/airflow/pull/24294

   Amazon provider at that moment has only one External link for `EmrClusterLink`.
   
   The idea was taken from Google Cloud Provider.
   
   Extend:
   - Hook: Return region and aws partition from current connection/session
   - Console link based on aws partition now
   - Link contains region_name now
   
   Add:
   - AwsBaseLink for further Links (Batch, Glue, Cloudwatch, etc.)
   - `EmrClusterLink` links to  `EmrAddStepsOperator`, `EmrModifyClusterOperator`,  `EmrTerminateJobFlowOperator`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ferruzzi commented on pull request #24294: Refactoring EmrClusterLink and add it for other AWS EMR Operators

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on PR #24294:
URL: https://github.com/apache/airflow/pull/24294#issuecomment-1150319263

   @shubham22  Check this out!   We were just talking about getting the Extra Links sections filled out for the operators when there was time!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Taragolis commented on pull request #24294: Refactoring EmrClusterLink and add it for other AWS EMR Operators

Posted by GitBox <gi...@apache.org>.
Taragolis commented on PR #24294:
URL: https://github.com/apache/airflow/pull/24294#issuecomment-1150395703

   >I wonder if we can figure out a way to work the operator_extra_links assignment into the AwsBaseOperator that you are working on as well, so we don't assign it each time. 
   
   I think it is hardly possible because each operator could required each set of Links and not all of them related to Hook
   
   Example:
   - For EMR at that moment only link to cluster could be provided because there is no direct link to steps exists.
   - For AWS Batch it could be set of links: Batch Job Detail, Latest Attempt logs in CloudWatch, Batch Job Definition, Batch Queue
   - For Glue Job it could be set of links: Glue Job Detail (Glue ETL Studio), Glue Job Detail (Legacy), Glue Job, Filter to CloudWatch prefixes 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ferruzzi commented on a diff in pull request #24294: Refactoring EmrClusterLink and add it for other AWS EMR Operators

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on code in PR #24294:
URL: https://github.com/apache/airflow/pull/24294#discussion_r892947388


##########
airflow/providers/amazon/aws/links/base.py:
##########
@@ -0,0 +1,87 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from datetime import datetime
+from typing import TYPE_CHECKING, ClassVar, Optional
+
+from airflow.models import BaseOperatorLink, XCom
+
+if TYPE_CHECKING:
+    from airflow.models import BaseOperator
+    from airflow.models.taskinstance import TaskInstanceKey
+    from airflow.utils.context import Context
+
+
+BASE_AWS_CONSOLE_LINK = "https://console.{aws_domain}"
+
+
+class BaseAwsLink(BaseOperatorLink):
+    """Base Helper class for constructing AWS Console Link"""
+
+    name: ClassVar[str]
+    key: ClassVar[str]
+    format_str: ClassVar[str]
+
+    @staticmethod
+    def get_aws_domain(aws_partition) -> Optional[str]:
+        if aws_partition == "aws":
+            return "aws.amazon.com"
+        elif aws_partition == "aws-cn":
+            return "amazonaws.cn"
+        elif aws_partition == "aws-us-gov":
+            return "amazonaws-us-gov.com"
+
+        return None
+
+    def get_link(
+        self,
+        operator,
+        dttm: Optional[datetime] = None,
+        ti_key: Optional["TaskInstanceKey"] = None,
+    ) -> str:
+        if ti_key is not None:

Review Comment:
   Huh.   Alright, carry on then.   Sorry about that.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] o-nikolas commented on a diff in pull request #24294: Refactoring EmrClusterLink and add it for other AWS EMR Operators

Posted by GitBox <gi...@apache.org>.
o-nikolas commented on code in PR #24294:
URL: https://github.com/apache/airflow/pull/24294#discussion_r892865165


##########
airflow/providers/amazon/aws/links/emr.py:
##########
@@ -0,0 +1,30 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow.providers.amazon.aws.links.base_aws import BASE_AWS_CONSOLE_LINK, BaseAwsLink
+
+EMR_CLUSTER_LINK = (
+    BASE_AWS_CONSOLE_LINK + "/elasticmapreduce/home?region={region_name}#cluster-details:{job_flow_id}"
+)
+
+
+class EmrClusterLink(BaseAwsLink):

Review Comment:
   If you end up following the above, this module will just contain a single 5-6 line class, I wonder if it would be better suited to live in the Operator module instead of its own module. That way we don't have the extra layer of indirection and we also save from creating new modules, directory, init files etc.
   
   What do you think?



##########
airflow/providers/amazon/aws/links/emr.py:
##########
@@ -0,0 +1,30 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow.providers.amazon.aws.links.base_aws import BASE_AWS_CONSOLE_LINK, BaseAwsLink
+
+EMR_CLUSTER_LINK = (

Review Comment:
   Small nit: Any reason not to define this inline inside the class for `format_str` (it's only used once there). It seems private to the `EmrClusterLink` class and it will get messy if more link classes are added to this module in the future (with all the format strings defined outside the classes).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk merged pull request #24294: Refactoring EmrClusterLink and add it for other AWS EMR Operators

Posted by GitBox <gi...@apache.org>.
potiuk merged PR #24294:
URL: https://github.com/apache/airflow/pull/24294


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Taragolis commented on a diff in pull request #24294: Refactoring EmrClusterLink and add it for other AWS EMR Operators

Posted by GitBox <gi...@apache.org>.
Taragolis commented on code in PR #24294:
URL: https://github.com/apache/airflow/pull/24294#discussion_r892897292


##########
airflow/providers/amazon/aws/links/emr.py:
##########
@@ -0,0 +1,30 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow.providers.amazon.aws.links.base_aws import BASE_AWS_CONSOLE_LINK, BaseAwsLink
+
+EMR_CLUSTER_LINK = (
+    BASE_AWS_CONSOLE_LINK + "/elasticmapreduce/home?region={region_name}#cluster-details:{job_flow_id}"
+)
+
+
+class EmrClusterLink(BaseAwsLink):

Review Comment:
   It would work for Links to specific services such as EMR, Batch, ECS, EC2 however in case if we would like provide link to CloudWatch, S3 (a lot of transfers operators) it would be better store in separate place. WDYT? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Taragolis commented on a diff in pull request #24294: Refactoring EmrClusterLink and add it for other AWS EMR Operators

Posted by GitBox <gi...@apache.org>.
Taragolis commented on code in PR #24294:
URL: https://github.com/apache/airflow/pull/24294#discussion_r892806279


##########
airflow/providers/amazon/aws/links/base.py:
##########
@@ -0,0 +1,87 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from datetime import datetime
+from typing import TYPE_CHECKING, ClassVar, Optional
+
+from airflow.models import BaseOperatorLink, XCom
+
+if TYPE_CHECKING:
+    from airflow.models import BaseOperator
+    from airflow.models.taskinstance import TaskInstanceKey
+    from airflow.utils.context import Context
+
+
+BASE_AWS_CONSOLE_LINK = "https://console.{aws_domain}"
+
+
+class BaseAwsLink(BaseOperatorLink):
+    """Base Helper class for constructing AWS Console Link"""
+
+    name: ClassVar[str]
+    key: ClassVar[str]
+    format_str: ClassVar[str]
+
+    @staticmethod
+    def get_aws_domain(aws_partition) -> Optional[str]:
+        if aws_partition == "aws":
+            return "aws.amazon.com"
+        elif aws_partition == "aws-cn":
+            return "amazonaws.cn"
+        elif aws_partition == "aws-us-gov":
+            return "amazonaws-us-gov.com"
+
+        return None
+
+    def get_link(
+        self,
+        operator,
+        dttm: Optional[datetime] = None,

Review Comment:
   Oh I grab it from other external links
   
   In BaseOperatorLink mention that this signature is deprecated, need to check from which version of Airflow it deprecated and remove everything related to `dttm`
   https://github.com/apache/airflow/blob/047a6162b0b4cbf07fe2fd978e335839a7d3900b/airflow/models/baseoperator.py#L1771-L1782



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Taragolis commented on pull request #24294: Refactoring EmrClusterLink and add for other AWS EMR Operators

Posted by GitBox <gi...@apache.org>.
Taragolis commented on PR #24294:
URL: https://github.com/apache/airflow/pull/24294#issuecomment-1149678178

   cc: @ferruzzi if you have a time could you have a look on this


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Taragolis commented on pull request #24294: Refactoring EmrClusterLink and add it for other AWS EMR Operators

Posted by GitBox <gi...@apache.org>.
Taragolis commented on PR #24294:
URL: https://github.com/apache/airflow/pull/24294#issuecomment-1150399141

   > We had in our backlog, but couldn't get to it as we are prioritizing System Test work.
   
   no problem, anyway there is plenty amount of time to next provider release after 4.0.0


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ferruzzi commented on a diff in pull request #24294: Refactoring EmrClusterLink and add it for other AWS EMR Operators

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on code in PR #24294:
URL: https://github.com/apache/airflow/pull/24294#discussion_r892812495


##########
airflow/providers/amazon/aws/links/base.py:
##########
@@ -0,0 +1,87 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from datetime import datetime
+from typing import TYPE_CHECKING, ClassVar, Optional
+
+from airflow.models import BaseOperatorLink, XCom
+
+if TYPE_CHECKING:
+    from airflow.models import BaseOperator
+    from airflow.models.taskinstance import TaskInstanceKey
+    from airflow.utils.context import Context
+
+
+BASE_AWS_CONSOLE_LINK = "https://console.{aws_domain}"
+
+
+class BaseAwsLink(BaseOperatorLink):
+    """Base Helper class for constructing AWS Console Link"""
+
+    name: ClassVar[str]
+    key: ClassVar[str]
+    format_str: ClassVar[str]
+
+    @staticmethod
+    def get_aws_domain(aws_partition) -> Optional[str]:
+        if aws_partition == "aws":
+            return "aws.amazon.com"
+        elif aws_partition == "aws-cn":
+            return "amazonaws.cn"
+        elif aws_partition == "aws-us-gov":
+            return "amazonaws-us-gov.com"
+
+        return None
+
+    def get_link(
+        self,
+        operator,
+        dttm: Optional[datetime] = None,

Review Comment:
   > Oh I grab it from other external links
   
   Yeah, I've seen `dttm` and similarly abbreviated variable names used pretty commonly in the Airflow codebase.  I guess it's just personal preference.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Taragolis commented on a diff in pull request #24294: Refactoring EmrClusterLink and add it for other AWS EMR Operators

Posted by GitBox <gi...@apache.org>.
Taragolis commented on code in PR #24294:
URL: https://github.com/apache/airflow/pull/24294#discussion_r892895163


##########
airflow/providers/amazon/aws/links/emr.py:
##########
@@ -0,0 +1,30 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow.providers.amazon.aws.links.base_aws import BASE_AWS_CONSOLE_LINK, BaseAwsLink
+
+EMR_CLUSTER_LINK = (

Review Comment:
   Initially I grab main idea from Google Cloud Provider
   
   https://github.com/apache/airflow/blob/047a6162b0b4cbf07fe2fd978e335839a7d3900b/airflow/providers/google/cloud/links/bigquery.py#L26-L35
   
   So, actually there is no reason set as constant, and I could directly set to `format_str`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ferruzzi commented on pull request #24294: Refactoring EmrClusterLink and add it for other AWS EMR Operators

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on PR #24294:
URL: https://github.com/apache/airflow/pull/24294#issuecomment-1150521312

   > I think it is hardly possible because each operator could required each set of Links and not all of them related to Hook
   
   Maybe.  But also, if BaseOperator created a link list containing the link tot he service's main console page, then that list could be extended in the operator if desired.  That would save assigning it in every Operator and only need the declaration in Operators where more than the basic link are desired.
   
   Either way, it's not a blocking suggestion, just a thought.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] vincbeck commented on pull request #24294: Refactoring EmrClusterLink and add it for other AWS EMR Operators

Posted by GitBox <gi...@apache.org>.
vincbeck commented on PR #24294:
URL: https://github.com/apache/airflow/pull/24294#issuecomment-1150326725

   Really nice work! Besides @ferruzzi comments, I dont have anything to say other than I like it :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ferruzzi commented on a diff in pull request #24294: Refactoring EmrClusterLink and add it for other AWS EMR Operators

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on code in PR #24294:
URL: https://github.com/apache/airflow/pull/24294#discussion_r892734133


##########
airflow/providers/amazon/aws/links/base.py:
##########
@@ -0,0 +1,87 @@
+#

Review Comment:
   Please rename the file base_aws.py to match the base hook.



##########
airflow/providers/amazon/aws/links/base.py:
##########
@@ -0,0 +1,87 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from datetime import datetime
+from typing import TYPE_CHECKING, ClassVar, Optional
+
+from airflow.models import BaseOperatorLink, XCom
+
+if TYPE_CHECKING:
+    from airflow.models import BaseOperator
+    from airflow.models.taskinstance import TaskInstanceKey
+    from airflow.utils.context import Context
+
+
+BASE_AWS_CONSOLE_LINK = "https://console.{aws_domain}"
+
+
+class BaseAwsLink(BaseOperatorLink):
+    """Base Helper class for constructing AWS Console Link"""
+
+    name: ClassVar[str]
+    key: ClassVar[str]
+    format_str: ClassVar[str]
+
+    @staticmethod
+    def get_aws_domain(aws_partition) -> Optional[str]:
+        if aws_partition == "aws":
+            return "aws.amazon.com"
+        elif aws_partition == "aws-cn":
+            return "amazonaws.cn"
+        elif aws_partition == "aws-us-gov":
+            return "amazonaws-us-gov.com"
+
+        return None
+
+    def get_link(
+        self,
+        operator,
+        dttm: Optional[datetime] = None,

Review Comment:
   I'm not crazy about `dttm` as a variable name, it's not immediately clear to me what ti is used for, but if changing that is a breaking change, then I guess we can live with it.



##########
airflow/providers/amazon/aws/links/base.py:
##########
@@ -0,0 +1,87 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from datetime import datetime
+from typing import TYPE_CHECKING, ClassVar, Optional
+
+from airflow.models import BaseOperatorLink, XCom
+
+if TYPE_CHECKING:
+    from airflow.models import BaseOperator
+    from airflow.models.taskinstance import TaskInstanceKey
+    from airflow.utils.context import Context
+
+
+BASE_AWS_CONSOLE_LINK = "https://console.{aws_domain}"
+
+
+class BaseAwsLink(BaseOperatorLink):
+    """Base Helper class for constructing AWS Console Link"""
+
+    name: ClassVar[str]
+    key: ClassVar[str]
+    format_str: ClassVar[str]
+
+    @staticmethod
+    def get_aws_domain(aws_partition) -> Optional[str]:
+        if aws_partition == "aws":
+            return "aws.amazon.com"
+        elif aws_partition == "aws-cn":
+            return "amazonaws.cn"
+        elif aws_partition == "aws-us-gov":
+            return "amazonaws-us-gov.com"
+
+        return None
+
+    def get_link(
+        self,
+        operator,
+        dttm: Optional[datetime] = None,
+        ti_key: Optional["TaskInstanceKey"] = None,
+    ) -> str:
+        if ti_key is not None:

Review Comment:
   ```suggestion
           if ti_key:
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] shubham22 commented on pull request #24294: Refactoring EmrClusterLink and add it for other AWS EMR Operators

Posted by GitBox <gi...@apache.org>.
shubham22 commented on PR #24294:
URL: https://github.com/apache/airflow/pull/24294#issuecomment-1150369439

   @Taragolis - this is very nice and great design as well! We had in our backlog, but couldn't get to it as we are prioritizing System Test work. Thanks for taking the initiative. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] o-nikolas commented on a diff in pull request #24294: Refactoring EmrClusterLink and add it for other AWS EMR Operators

Posted by GitBox <gi...@apache.org>.
o-nikolas commented on code in PR #24294:
URL: https://github.com/apache/airflow/pull/24294#discussion_r892863991


##########
airflow/providers/amazon/aws/links/emr.py:
##########
@@ -0,0 +1,30 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow.providers.amazon.aws.links.base_aws import BASE_AWS_CONSOLE_LINK, BaseAwsLink
+
+EMR_CLUSTER_LINK = (

Review Comment:
   Any reason not to define this inline inside the class for `format_str` (it's only used once there). It seems private to the `EmrClusterLink` class and it will get messy if more link classes are added to this module in the future (with all the format strings defined outside the classes).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] o-nikolas commented on a diff in pull request #24294: Refactoring EmrClusterLink and add it for other AWS EMR Operators

Posted by GitBox <gi...@apache.org>.
o-nikolas commented on code in PR #24294:
URL: https://github.com/apache/airflow/pull/24294#discussion_r892904177


##########
airflow/providers/amazon/aws/links/emr.py:
##########
@@ -0,0 +1,30 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow.providers.amazon.aws.links.base_aws import BASE_AWS_CONSOLE_LINK, BaseAwsLink
+
+EMR_CLUSTER_LINK = (
+    BASE_AWS_CONSOLE_LINK + "/elasticmapreduce/home?region={region_name}#cluster-details:{job_flow_id}"
+)
+
+
+class EmrClusterLink(BaseAwsLink):

Review Comment:
   For sure, generic classes should be defined in one or more modules in `../aws/links/`, if we come to a point where we find that we need such classes. I'm just not sure an extra module for each individual (non-shared) Operator Link class is required, I think the per Operator Link classes could just live in the Operator module itself. But maybe it's clearer to have them all in once place.
   
   I'm happy to approve either way :) 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Taragolis commented on a diff in pull request #24294: Refactoring EmrClusterLink and add it for other AWS EMR Operators

Posted by GitBox <gi...@apache.org>.
Taragolis commented on code in PR #24294:
URL: https://github.com/apache/airflow/pull/24294#discussion_r892909627


##########
airflow/providers/amazon/aws/links/emr.py:
##########
@@ -0,0 +1,30 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow.providers.amazon.aws.links.base_aws import BASE_AWS_CONSOLE_LINK, BaseAwsLink
+
+EMR_CLUSTER_LINK = (
+    BASE_AWS_CONSOLE_LINK + "/elasticmapreduce/home?region={region_name}#cluster-details:{job_flow_id}"
+)
+
+
+class EmrClusterLink(BaseAwsLink):

Review Comment:
   Also I've initially think about add it for sensors however for sensors it would some side effect - for every `poke` it would write to XCom



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ferruzzi commented on a diff in pull request #24294: Refactoring EmrClusterLink and add it for other AWS EMR Operators

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on code in PR #24294:
URL: https://github.com/apache/airflow/pull/24294#discussion_r892769556


##########
airflow/providers/amazon/aws/links/base.py:
##########
@@ -0,0 +1,87 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from datetime import datetime
+from typing import TYPE_CHECKING, ClassVar, Optional
+
+from airflow.models import BaseOperatorLink, XCom
+
+if TYPE_CHECKING:
+    from airflow.models import BaseOperator
+    from airflow.models.taskinstance import TaskInstanceKey
+    from airflow.utils.context import Context
+
+
+BASE_AWS_CONSOLE_LINK = "https://console.{aws_domain}"
+
+
+class BaseAwsLink(BaseOperatorLink):
+    """Base Helper class for constructing AWS Console Link"""
+
+    name: ClassVar[str]
+    key: ClassVar[str]
+    format_str: ClassVar[str]
+
+    @staticmethod
+    def get_aws_domain(aws_partition) -> Optional[str]:
+        if aws_partition == "aws":
+            return "aws.amazon.com"
+        elif aws_partition == "aws-cn":
+            return "amazonaws.cn"
+        elif aws_partition == "aws-us-gov":
+            return "amazonaws-us-gov.com"
+
+        return None
+
+    def get_link(
+        self,
+        operator,
+        dttm: Optional[datetime] = None,

Review Comment:
   I'm not crazy about `dttm` as a variable name, it's not immediately clear to me what it is used for, but if changing that is a breaking change, then I guess we can live with it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Taragolis commented on a diff in pull request #24294: Refactoring EmrClusterLink and add it for other AWS EMR Operators

Posted by GitBox <gi...@apache.org>.
Taragolis commented on code in PR #24294:
URL: https://github.com/apache/airflow/pull/24294#discussion_r892825604


##########
airflow/providers/amazon/aws/links/base.py:
##########
@@ -0,0 +1,87 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from datetime import datetime
+from typing import TYPE_CHECKING, ClassVar, Optional
+
+from airflow.models import BaseOperatorLink, XCom
+
+if TYPE_CHECKING:
+    from airflow.models import BaseOperator
+    from airflow.models.taskinstance import TaskInstanceKey
+    from airflow.utils.context import Context
+
+
+BASE_AWS_CONSOLE_LINK = "https://console.{aws_domain}"
+
+
+class BaseAwsLink(BaseOperatorLink):
+    """Base Helper class for constructing AWS Console Link"""
+
+    name: ClassVar[str]
+    key: ClassVar[str]
+    format_str: ClassVar[str]
+
+    @staticmethod
+    def get_aws_domain(aws_partition) -> Optional[str]:
+        if aws_partition == "aws":
+            return "aws.amazon.com"
+        elif aws_partition == "aws-cn":
+            return "amazonaws.cn"
+        elif aws_partition == "aws-us-gov":
+            return "amazonaws-us-gov.com"
+
+        return None
+
+    def get_link(
+        self,
+        operator,
+        dttm: Optional[datetime] = None,
+        ti_key: Optional["TaskInstanceKey"] = None,
+    ) -> str:
+        if ti_key is not None:

Review Comment:
   No way to change it right now
   
   ```
   Check that providers are 2.1 compatible................................................Failed
   - hook id: check-airflow-2-1-compatibility
   - exit code: 1
   
   Found Airflow 2.2 compatibility problems in providers:
   
   In airflow/providers/amazon/aws/links/base.py:65 there is a forbidden construct (Airflow 2.3.0 only):
   
           if ti_key:
               conf = XCom.get_value(key=self.key, ti_key=ti_key)
   
   When you use XCom.get_value( in providers, it should be in the form:
   
   if ti_key is not None:
       value = XCom.get_value(...., ti_key=ti_key)
   
   See: https://airflow.apache.org/docs/apache-airflow-providers/howto/create-update-providers.html#using-providers-with-dynamic-task-mapping
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Taragolis commented on a diff in pull request #24294: Refactoring EmrClusterLink and add it for other AWS EMR Operators

Posted by GitBox <gi...@apache.org>.
Taragolis commented on code in PR #24294:
URL: https://github.com/apache/airflow/pull/24294#discussion_r892815087


##########
airflow/providers/amazon/aws/links/base.py:
##########
@@ -0,0 +1,87 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from datetime import datetime
+from typing import TYPE_CHECKING, ClassVar, Optional
+
+from airflow.models import BaseOperatorLink, XCom
+
+if TYPE_CHECKING:
+    from airflow.models import BaseOperator
+    from airflow.models.taskinstance import TaskInstanceKey
+    from airflow.utils.context import Context
+
+
+BASE_AWS_CONSOLE_LINK = "https://console.{aws_domain}"
+
+
+class BaseAwsLink(BaseOperatorLink):
+    """Base Helper class for constructing AWS Console Link"""
+
+    name: ClassVar[str]
+    key: ClassVar[str]
+    format_str: ClassVar[str]
+
+    @staticmethod
+    def get_aws_domain(aws_partition) -> Optional[str]:
+        if aws_partition == "aws":
+            return "aws.amazon.com"
+        elif aws_partition == "aws-cn":
+            return "amazonaws.cn"
+        elif aws_partition == "aws-us-gov":
+            return "amazonaws-us-gov.com"
+
+        return None
+
+    def get_link(
+        self,
+        operator,
+        dttm: Optional[datetime] = None,

Review Comment:
   I found this PR #21798 this is uses for support Airflow 2.2 and lower. So as soon as amazon-provider will required 2.3+ we could keep only `ti_key` and `operator`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] github-actions[bot] commented on pull request #24294: Refactoring EmrClusterLink and add it for other AWS EMR Operators

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #24294:
URL: https://github.com/apache/airflow/pull/24294#issuecomment-1152975088

   The PR is likely OK to be merged with just subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ferruzzi commented on pull request #24294: Refactoring EmrClusterLink and add it for other AWS EMR Operators

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on PR #24294:
URL: https://github.com/apache/airflow/pull/24294#issuecomment-1153066959

   Yeah, I'm good.  Like I said, it's just something to consider, I didn't want to block the merge over that.  Thanks for checking.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Taragolis commented on pull request #24294: Refactoring EmrClusterLink and add it for other AWS EMR Operators

Posted by GitBox <gi...@apache.org>.
Taragolis commented on PR #24294:
URL: https://github.com/apache/airflow/pull/24294#issuecomment-1152583999

   > Maybe. But also, if BaseOperator created a Link set containing the link to the service's main console page
   
   The implementation of Aws Service link it straightforward
   
   ```python
   import attr
   from airflow.providers.amazon.aws.links.base_aws import BaseAwsLink, BASE_AWS_CONSOLE_LINK
   
   
   @attr.s(auto_attribs=True)
   class AwsServiceLink(BaseAwsLink):
       name: str
       format_str: str
       key = "aws_service"
   
   AwsServiceLink(
       name="AWS EMR Console Link",
       format_str=BASE_AWS_CONSOLE_LINK + "/elasticmapreduce/home?region={region_name}#"
   )
   ```
   
   The is two things which need to be check and solved
   1. Each operator should call `AwsServiceLink.persist` in execute method of operator/sensor
   2. Need to check is it fine to change `operator_extra_links` in runtime
   
   > I'm also not sure what this current implementation is going to look like when you want to add a second/third/nth link.
   
   All operator links predefined in operator in that moment
   
   In execute stage call `persist` method of link with appropriate keywords, which stored in XCom


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org