You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by "luos-fc (via GitHub)" <gi...@apache.org> on 2023/07/03 14:35:15 UTC

[GitHub] [airflow] luos-fc opened a new pull request, #32331: Only update crawler tags if present in config dict

luos-fc opened a new pull request, #32331:
URL: https://github.com/apache/airflow/pull/32331

   Closes: #32330 
   
   This PR modifies the GlueCrawlerHook to only update the crawlers tags if the `Tags` key is present in the `config` dict. This prevent situations where tags that are set by external processes are overwritten whenever the crawler is run from Airflow.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] utkarsharma2 commented on a diff in pull request #32331: Only update crawler tags if present in config dict

Posted by "utkarsharma2 (via GitHub)" <gi...@apache.org>.
utkarsharma2 commented on code in PR #32331:
URL: https://github.com/apache/airflow/pull/32331#discussion_r1251581477


##########
tests/providers/amazon/aws/hooks/test_glue_crawler.py:
##########
@@ -160,6 +160,18 @@ def test_remove_all_tags(self, mock_get_conn):
             ResourceArn=self.crawler_arn, TagsToRemove=["test", "bar"]
         )
 
+    @mock_sts
+    @mock.patch.object(GlueCrawlerHook, "get_conn")
+    def test_update_missing_tags(self, mock_get_conn):
+        mock_config_missing_tags = deepcopy(mock_config)
+        mock_config_missing_tags.pop("Tags")
+        mock_get_conn.return_value.get_crawler.return_value = {"Crawler": mock_config_missing_tags}
+
+        assert self.hook.update_crawler(**mock_config_missing_tags) is False

Review Comment:
   Should we also add a testcase for when `Tags` key is in the config? Not sure if it's already covered. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Adaverse commented on pull request #32331: Only update crawler tags if present in config dict

Posted by "Adaverse (via GitHub)" <gi...@apache.org>.
Adaverse commented on PR #32331:
URL: https://github.com/apache/airflow/pull/32331#issuecomment-1620679334

   > > Can we also do this cleanup i.e in test `test_replace_tag`, it seems like the below code snippet is just lying there -
   > > ```
   > >         mock_config_two = deepcopy(mock_config)
   > >         mock_config_two.pop("Tags")
   > > ```
   > 
   > what do you mean by lying :) ?
   
   The variable `mock_config_two` isn't being used just defined. Pardon for the typo lying :).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on a diff in pull request #32331: Only update crawler tags if present in config dict

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on code in PR #32331:
URL: https://github.com/apache/airflow/pull/32331#discussion_r1253466086


##########
tests/providers/amazon/aws/hooks/test_glue_crawler.py:
##########
@@ -160,6 +160,18 @@ def test_remove_all_tags(self, mock_get_conn):
             ResourceArn=self.crawler_arn, TagsToRemove=["test", "bar"]
         )
 
+    @mock_sts
+    @mock.patch.object(GlueCrawlerHook, "get_conn")
+    def test_update_missing_tags(self, mock_get_conn):
+        mock_config_missing_tags = deepcopy(mock_config)
+        mock_config_missing_tags.pop("Tags")
+        mock_get_conn.return_value.get_crawler.return_value = {"Crawler": mock_config_missing_tags}
+
+        assert self.hook.update_crawler(**mock_config_missing_tags) is False

Review Comment:
   yeah. Can you double check it @utkarsharma2 ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk merged pull request #32331: Only update crawler tags if present in config dict

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk merged PR #32331:
URL: https://github.com/apache/airflow/pull/32331


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Adaverse commented on pull request #32331: Only update crawler tags if present in config dict

Posted by "Adaverse (via GitHub)" <gi...@apache.org>.
Adaverse commented on PR #32331:
URL: https://github.com/apache/airflow/pull/32331#issuecomment-1620668688

   Can we also do this cleanup i.e in test `test_replace_tag`, it seems like the below code snippet is just lying there - 
   ```
           mock_config_two = deepcopy(mock_config)
           mock_config_two.pop("Tags")
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] luos-fc commented on a diff in pull request #32331: Only update crawler tags if present in config dict

Posted by "luos-fc (via GitHub)" <gi...@apache.org>.
luos-fc commented on code in PR #32331:
URL: https://github.com/apache/airflow/pull/32331#discussion_r1254163820


##########
tests/providers/amazon/aws/hooks/test_glue_crawler.py:
##########
@@ -160,6 +160,18 @@ def test_remove_all_tags(self, mock_get_conn):
             ResourceArn=self.crawler_arn, TagsToRemove=["test", "bar"]
         )
 
+    @mock_sts
+    @mock.patch.object(GlueCrawlerHook, "get_conn")
+    def test_update_missing_tags(self, mock_get_conn):
+        mock_config_missing_tags = deepcopy(mock_config)
+        mock_config_missing_tags.pop("Tags")
+        mock_get_conn.return_value.get_crawler.return_value = {"Crawler": mock_config_missing_tags}
+
+        assert self.hook.update_crawler(**mock_config_missing_tags) is False

Review Comment:
   I removed those two lines



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #32331: Only update crawler tags if present in config dict

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on PR #32331:
URL: https://github.com/apache/airflow/pull/32331#issuecomment-1620674790

   > Can we also do this cleanup i.e in test `test_replace_tag`, it seems like the below code snippet is just lying there -
   > 
   > ```
   >         mock_config_two = deepcopy(mock_config)
   >         mock_config_two.pop("Tags")
   > ```
   
   what do you mean by lying :) ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Adaverse commented on a diff in pull request #32331: Only update crawler tags if present in config dict

Posted by "Adaverse (via GitHub)" <gi...@apache.org>.
Adaverse commented on code in PR #32331:
URL: https://github.com/apache/airflow/pull/32331#discussion_r1251606258


##########
tests/providers/amazon/aws/hooks/test_glue_crawler.py:
##########
@@ -160,6 +160,18 @@ def test_remove_all_tags(self, mock_get_conn):
             ResourceArn=self.crawler_arn, TagsToRemove=["test", "bar"]
         )
 
+    @mock_sts
+    @mock.patch.object(GlueCrawlerHook, "get_conn")
+    def test_update_missing_tags(self, mock_get_conn):
+        mock_config_missing_tags = deepcopy(mock_config)
+        mock_config_missing_tags.pop("Tags")
+        mock_get_conn.return_value.get_crawler.return_value = {"Crawler": mock_config_missing_tags}
+
+        assert self.hook.update_crawler(**mock_config_missing_tags) is False

Review Comment:
   i think `test_replace_tag` function is testing the above case.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] utkarsharma2 commented on a diff in pull request #32331: Only update crawler tags if present in config dict

Posted by "utkarsharma2 (via GitHub)" <gi...@apache.org>.
utkarsharma2 commented on code in PR #32331:
URL: https://github.com/apache/airflow/pull/32331#discussion_r1251581477


##########
tests/providers/amazon/aws/hooks/test_glue_crawler.py:
##########
@@ -160,6 +160,18 @@ def test_remove_all_tags(self, mock_get_conn):
             ResourceArn=self.crawler_arn, TagsToRemove=["test", "bar"]
         )
 
+    @mock_sts
+    @mock.patch.object(GlueCrawlerHook, "get_conn")
+    def test_update_missing_tags(self, mock_get_conn):
+        mock_config_missing_tags = deepcopy(mock_config)
+        mock_config_missing_tags.pop("Tags")
+        mock_get_conn.return_value.get_crawler.return_value = {"Crawler": mock_config_missing_tags}
+
+        assert self.hook.update_crawler(**mock_config_missing_tags) is False

Review Comment:
   Should we also add a testcase for when `Tags` key is in config? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] utkarsharma2 commented on a diff in pull request #32331: Only update crawler tags if present in config dict

Posted by "utkarsharma2 (via GitHub)" <gi...@apache.org>.
utkarsharma2 commented on code in PR #32331:
URL: https://github.com/apache/airflow/pull/32331#discussion_r1253588238


##########
tests/providers/amazon/aws/hooks/test_glue_crawler.py:
##########
@@ -160,6 +160,18 @@ def test_remove_all_tags(self, mock_get_conn):
             ResourceArn=self.crawler_arn, TagsToRemove=["test", "bar"]
         )
 
+    @mock_sts
+    @mock.patch.object(GlueCrawlerHook, "get_conn")
+    def test_update_missing_tags(self, mock_get_conn):
+        mock_config_missing_tags = deepcopy(mock_config)
+        mock_config_missing_tags.pop("Tags")
+        mock_get_conn.return_value.get_crawler.return_value = {"Crawler": mock_config_missing_tags}
+
+        assert self.hook.update_crawler(**mock_config_missing_tags) is False

Review Comment:
   Ya, they are covered. I got confused by these lines, it seems like they don't add any value to the test. Test seems to work even after removing them. 
   https://github.com/apache/airflow/blob/d01248382fe45a5f5a7fdeed4082a80c5f814ad8/tests/providers/amazon/aws/hooks/test_glue_crawler.py#L169-L170
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org