You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/12/01 18:25:34 UTC

[GitHub] [airflow] Taragolis opened a new pull request, #28039: Migrate amazon provider hooks tests from `unittests` to `pytest`

Taragolis opened a new pull request, #28039:
URL: https://github.com/apache/airflow/pull/28039

   related: #27970
   
   Migrate Amazon provider's hooks and utils tests to `pytest`.
   
   All changes are more or less straightforward:
   - Get rid of `unittests.TestCase` class and **TestCase.assert*** methods
   - Replace decorator `parameterized.expand` by `pytest.mark.parametrize`. I
   - Convert **setUp*** and **tearDown*** methods to [appropriate pytest alternative](https://docs.pytest.org/en/6.2.x/xunit_setup.html#classic-xunit-style-setup)
   
   _See additional findings, info about significant changes and potential follow up in comments_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ferruzzi commented on a diff in pull request #28039: Migrate amazon provider hooks tests from `unittests` to `pytest`

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on code in PR #28039:
URL: https://github.com/apache/airflow/pull/28039#discussion_r1037539606


##########
tests/providers/amazon/aws/hooks/test_glue_crawler.py:
##########
@@ -83,18 +82,17 @@
 }
 
 
-class TestGlueCrawlerHook(unittest.TestCase):
-    @classmethod
-    def setUp(cls):
-        cls.hook = GlueCrawlerHook(aws_conn_id="aws_default")
+class TestGlueCrawlerHook:
+    def setup_method(self):
+        self.hook = GlueCrawlerHook(aws_conn_id="aws_default")
 
     def test_init(self):
-        self.assertEqual(self.hook.aws_conn_id, "aws_default")
+        assert self.hook.aws_conn_id == "aws_default"
 
     @mock.patch.object(GlueCrawlerHook, "get_conn")

Review Comment:
   (No change requested/required, just chatter)  I mentioned on slack that I had started on some of these; this was the rabbit hole that got me there... I started updating any hooks that still used get_conn() instead of the conn cached property, which led to updating the mocked values in the unit tests, which led to converting them all to pytest which... never got finished :P   So again, thanks for taking this on!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Taragolis commented on a diff in pull request #28039: Migrate amazon provider hooks tests from `unittests` to `pytest`

Posted by GitBox <gi...@apache.org>.
Taragolis commented on code in PR #28039:
URL: https://github.com/apache/airflow/pull/28039#discussion_r1037832131


##########
tests/providers/amazon/aws/hooks/test_emr_containers.py:
##########
@@ -110,7 +109,9 @@ def test_query_status_polling_with_timeout(self, mock_session):
         mock_session.return_value = emr_session_mock
         emr_client_mock.describe_job_run.return_value = JOB2_RUN_DESCRIPTION
 
-        query_status = self.emr_containers.poll_query_status(job_id="job123456", max_polling_attempts=2)
+        query_status = self.emr_containers.poll_query_status(
+            job_id="job123456", max_polling_attempts=2, poll_interval=2
+        )

Review Comment:
   Actually we do not need to wait at all but better to use `freezegun` or mock `time.sleep` however  with `poll_interval = 0` it would be run just 0.5 sec (localy).
   
   But also it shows that there is no validation for this value. Usual `0` for any wait interval in general not a good idea,
   because it could easily converted to something like
   
   ```python
   while true:
       pass  # 100% core ๐Ÿ˜ข usage
   ```
   
   Or in this case just exceed the limits of AWS API



##########
tests/providers/amazon/aws/hooks/test_emr_containers.py:
##########
@@ -110,7 +109,9 @@ def test_query_status_polling_with_timeout(self, mock_session):
         mock_session.return_value = emr_session_mock
         emr_client_mock.describe_job_run.return_value = JOB2_RUN_DESCRIPTION
 
-        query_status = self.emr_containers.poll_query_status(job_id="job123456", max_polling_attempts=2)
+        query_status = self.emr_containers.poll_query_status(
+            job_id="job123456", max_polling_attempts=2, poll_interval=2
+        )

Review Comment:
   Actually we do not need to wait at all but better to use `freezegun` or mock `time.sleep` however  with `poll_interval = 0` it would be run just 0.5 sec (localy).
   
   But also it shows that there is no validation for this value. Usual `0` for any wait interval in general not a good idea,
   because it could easily converted to something like
   
   ```python
   while True:
       pass  # 100% core ๐Ÿ˜ข usage
   ```
   
   Or in this case just exceed the limits of AWS API



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ferruzzi commented on a diff in pull request #28039: Migrate amazon provider hooks tests from `unittests` to `pytest`

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on code in PR #28039:
URL: https://github.com/apache/airflow/pull/28039#discussion_r1038468940


##########
tests/providers/amazon/aws/hooks/test_glacier.py:
##########
@@ -47,25 +47,21 @@ def test_retrieve_inventory_should_return_job_id(self, mock_conn):
         assert job_id == result
 
     @mock.patch("airflow.providers.amazon.aws.hooks.glacier.GlacierHook.get_conn")
-    def test_retrieve_inventory_should_log_mgs(self, mock_conn):
+    def test_retrieve_inventory_should_log_mgs(self, mock_conn, caplog):
         # given

Review Comment:
   Yeah, every time I try do add some new code I find a dozen older bits I'd love to redo in the process.  Like why are we mocking get_conn in every test instead of making one fixture to do it?  I'll take care of that one at some point.  Can't fix it all at once :+1: 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk merged pull request #28039: Migrate amazon provider hooks tests from `unittests` to `pytest`

Posted by GitBox <gi...@apache.org>.
potiuk merged PR #28039:
URL: https://github.com/apache/airflow/pull/28039


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Taragolis commented on pull request #28039: Migrate amazon provider hooks tests from `unittests` to `pytest`

Posted by GitBox <gi...@apache.org>.
Taragolis commented on PR #28039:
URL: https://github.com/apache/airflow/pull/28039#issuecomment-1334568914

   > nice. I noticed that on some `setup_method`s you added the optional `method` parameter without using it, while you skipped it on others. I think it'd be best to omit it wherever possible (i.e. probably everywhere in this case)
   
   That's funny but if `setup_method` or `teardown_method` decorated by `@mock.patัh` then should use signature (self, method


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] vandonr-amz commented on pull request #28039: Migrate amazon provider hooks tests from `unittests` to `pytest`

Posted by GitBox <gi...@apache.org>.
vandonr-amz commented on PR #28039:
URL: https://github.com/apache/airflow/pull/28039#issuecomment-1334671366

   > > nice. I noticed that on some `setup_method`s you added the optional `method` parameter without using it, while you skipped it on others. I think it'd be best to omit it wherever possible (i.e. probably everywhere in this case)
   > 
   > That's funny but if `setup_method` or `teardown_method` decorated by `@mock.patัh` then method signature (self, method) should use
   
   Ah yes I imagine because of the order of arguments, but I imagine in test_datasync it's not needed ?
   Is it also needed for mocks that "generate" no arguments, like the `@mock.patch.dict` ?
   
   and nit: I like using `_` as name for the arguments that are unused, it's a clear way to mark them as "only here for the compiler to be happy"


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ferruzzi commented on pull request #28039: Migrate amazon provider hooks tests from `unittests` to `pytest`

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on PR #28039:
URL: https://github.com/apache/airflow/pull/28039#issuecomment-1334330677

   That's awesome.  I was literally looking at this earlier this week and have some local code where I started working on it. :+1:   Will review ASAP


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ferruzzi commented on a diff in pull request #28039: Migrate amazon provider hooks tests from `unittests` to `pytest`

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on code in PR #28039:
URL: https://github.com/apache/airflow/pull/28039#discussion_r1037535947


##########
tests/providers/amazon/aws/hooks/test_glacier.py:
##########
@@ -47,25 +47,21 @@ def test_retrieve_inventory_should_return_job_id(self, mock_conn):
         assert job_id == result
 
     @mock.patch("airflow.providers.amazon.aws.hooks.glacier.GlacierHook.get_conn")
-    def test_retrieve_inventory_should_log_mgs(self, mock_conn):
+    def test_retrieve_inventory_should_log_mgs(self, mock_conn, caplog):
         # given

Review Comment:
   Do we want to remove the `given/when/then` comments in the very few places they still exist while we're at it?  If you don't want to mess with that now, that's understandable too.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] vandonr-amz commented on a diff in pull request #28039: Migrate amazon provider hooks tests from `unittests` to `pytest`

Posted by GitBox <gi...@apache.org>.
vandonr-amz commented on code in PR #28039:
URL: https://github.com/apache/airflow/pull/28039#discussion_r1037610212


##########
tests/providers/amazon/aws/hooks/test_glue_crawler.py:
##########
@@ -83,18 +82,17 @@
 }
 
 
-class TestGlueCrawlerHook(unittest.TestCase):
-    @classmethod
-    def setUp(cls):
-        cls.hook = GlueCrawlerHook(aws_conn_id="aws_default")
+class TestGlueCrawlerHook:
+    def setup_method(self):
+        self.hook = GlueCrawlerHook(aws_conn_id="aws_default")
 
     def test_init(self):
-        self.assertEqual(self.hook.aws_conn_id, "aws_default")
+        assert self.hook.aws_conn_id == "aws_default"
 
     @mock.patch.object(GlueCrawlerHook, "get_conn")
     def test_has_crawler(self, mock_get_conn):
         response = self.hook.has_crawler(mock_crawler_name)
-        self.assertEqual(response, True)
+        assert response

Review Comment:
   nit: I find it more readable to use `assert response is True` (or `is False`) when asserting on boolean values, it reads more like English.



##########
tests/providers/amazon/aws/hooks/test_emr_containers.py:
##########
@@ -110,7 +109,9 @@ def test_query_status_polling_with_timeout(self, mock_session):
         mock_session.return_value = emr_session_mock
         emr_client_mock.describe_job_run.return_value = JOB2_RUN_DESCRIPTION
 
-        query_status = self.emr_containers.poll_query_status(job_id="job123456", max_polling_attempts=2)
+        query_status = self.emr_containers.poll_query_status(
+            job_id="job123456", max_polling_attempts=2, poll_interval=2
+        )

Review Comment:
   nice, but do we need to wait at all ? is `poll_interval = 0` an option ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Taragolis commented on a diff in pull request #28039: Migrate amazon provider hooks tests from `unittests` to `pytest`

Posted by GitBox <gi...@apache.org>.
Taragolis commented on code in PR #28039:
URL: https://github.com/apache/airflow/pull/28039#discussion_r1037814650


##########
tests/providers/amazon/aws/hooks/test_batch_client.py:
##########
@@ -299,16 +300,17 @@ def test_job_splunk_logs(self):
                 }
             ]
         }
-        with self.assertLogs(level="WARNING") as capture_logs:
+        with caplog.at_level(level=logging.WARNING):
+            caplog.clear()
             assert self.batch_client.get_job_awslogs_info(JOB_ID) is None
-            assert len(capture_logs.records) == 1
+        assert len(caplog.messages) == 1
 
 
-class TestBatchClientDelays(unittest.TestCase):
+class TestBatchClientDelays:
     @mock.patch.dict("os.environ", AWS_DEFAULT_REGION=AWS_REGION)
     @mock.patch.dict("os.environ", AWS_ACCESS_KEY_ID=AWS_ACCESS_KEY_ID)
     @mock.patch.dict("os.environ", AWS_SECRET_ACCESS_KEY=AWS_SECRET_ACCESS_KEY)
-    def setUp(self):
+    def setup_method(self, method):

Review Comment:
   > Is it also needed for mocks that "generate" no arguments, like the @mock.patch.dict ?
   
   And even with this case we need 2 arguments
   
   ```
   args = (<tests.providers.amazon.aws.hooks.test_batch_client.TestBatchClientDelays object at 0x1071f0f70>, <bound method TestB...tDelays.test_init of <tests.providers.amazon.aws.hooks.test_batch_client.TestBatchClientDelays object at 0x1071f0f70>>)
   kw = {}
   
       @wraps(f)
       def _inner(*args, **kw):
           self._patch_dict()
           try:
   >           return f(*args, **kw)
   E           TypeError: setup_method() takes 1 positional argument but 2 were given
   ```



##########
tests/providers/amazon/aws/hooks/test_glue_crawler.py:
##########
@@ -83,18 +82,17 @@
 }
 
 
-class TestGlueCrawlerHook(unittest.TestCase):
-    @classmethod
-    def setUp(cls):
-        cls.hook = GlueCrawlerHook(aws_conn_id="aws_default")
+class TestGlueCrawlerHook:
+    def setup_method(self):
+        self.hook = GlueCrawlerHook(aws_conn_id="aws_default")
 
     def test_init(self):
-        self.assertEqual(self.hook.aws_conn_id, "aws_default")
+        assert self.hook.aws_conn_id == "aws_default"
 
     @mock.patch.object(GlueCrawlerHook, "get_conn")
     def test_has_crawler(self, mock_get_conn):
         response = self.hook.has_crawler(mock_crawler_name)
-        self.assertEqual(response, True)
+        assert response

Review Comment:
   Good point ๐Ÿ‘ 



##########
tests/providers/amazon/aws/hooks/test_glacier.py:
##########
@@ -47,25 +47,21 @@ def test_retrieve_inventory_should_return_job_id(self, mock_conn):
         assert job_id == result
 
     @mock.patch("airflow.providers.amazon.aws.hooks.glacier.GlacierHook.get_conn")
-    def test_retrieve_inventory_should_log_mgs(self, mock_conn):
+    def test_retrieve_inventory_should_log_mgs(self, mock_conn, caplog):
         # given

Review Comment:
   Right now main target just migrate to `pytests`.
   It is good point to change something from the past ๐Ÿ˜‰
   To be honest a lot of test need to rewrite later anyway



##########
tests/providers/amazon/aws/hooks/test_datasync.py:
##########
@@ -50,21 +49,13 @@ def test_get_conn(self):
 
 @mock_datasync
 @mock.patch.object(DataSyncHook, "get_conn")
-class TestDataSyncHookMocked(unittest.TestCase):
-    def __init__(self, *args, **kwargs):
-        super().__init__(*args, **kwargs)
-        self.source_server_hostname = "host"
-        self.source_subdirectory = "somewhere"
-        self.destination_bucket_name = "my_bucket"
-        self.destination_bucket_dir = "dir"
+class TestDataSyncHookMocked:
+    source_server_hostname = "host"
+    source_subdirectory = "somewhere"
+    destination_bucket_name = "my_bucket"
+    destination_bucket_dir = "dir"
 
-        self.client = None
-        self.hook = None
-        self.source_location_arn = None
-        self.destination_location_arn = None
-        self.task_arn = None
-
-    def setUp(self):
+    def setup_method(self, method):

Review Comment:
   Original @vandonr-amz 
   > Ah yes I imagine because of the order of arguments, but I imagine in test_datasync it's not needed ?
   
   Actually this arg needed because we decorated entire class 
   
   ```
   test setup failed
   args = (<tests.providers.amazon.aws.hooks.test_datasync.TestDataSyncHookMocked object at 0x109fdf4c0>, <bound method TestData...HookMocked.test_init of <tests.providers.amazon.aws.hooks.test_datasync.TestDataSyncHookMocked object at 0x109fdf4c0>>)
   kwargs = {}
   
       def wrapper(*args, **kwargs):
           self.start(reset=reset)
           try:
   >           result = func(*args, **kwargs)
   E           TypeError: setup_method() takes 1 positional argument but 2 were given
   ```



##########
tests/providers/amazon/aws/hooks/test_batch_waiters.py:
##########
@@ -317,12 +316,12 @@ def test_batch_job_waiting(aws_clients, aws_region, job_queue_name, job_definiti
     assert job_status == "SUCCEEDED"
 
 
-class TestBatchWaiters(unittest.TestCase):
+class TestBatchWaiters:
     @mock.patch.dict("os.environ", AWS_DEFAULT_REGION=AWS_REGION)
     @mock.patch.dict("os.environ", AWS_ACCESS_KEY_ID=AWS_ACCESS_KEY_ID)
     @mock.patch.dict("os.environ", AWS_SECRET_ACCESS_KEY=AWS_SECRET_ACCESS_KEY)
     @mock.patch("airflow.providers.amazon.aws.hooks.batch_client.AwsBaseHook.get_client_type")
-    def setUp(self, get_client_type_mock):
+    def setup_method(self, method, get_client_type_mock):

Review Comment:
   Original @vandonr-amz
   
   > and nit: I like using _ as name for the arguments that are unused, it's a clear way to mark them as "only here for the compiler to be happy"
   
   This mostly as reminder what the [actual argument here](https://docs.pytest.org/en/6.2.x/xunit_setup.html#method-and-function-level-setup-teardown). 
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Taragolis commented on a diff in pull request #28039: Migrate amazon provider hooks tests from `unittests` to `pytest`

Posted by GitBox <gi...@apache.org>.
Taragolis commented on code in PR #28039:
URL: https://github.com/apache/airflow/pull/28039#discussion_r1037439960


##########
tests/providers/amazon/aws/hooks/test_emr_containers.py:
##########
@@ -110,7 +109,9 @@ def test_query_status_polling_with_timeout(self, mock_session):
         mock_session.return_value = emr_session_mock
         emr_client_mock.describe_job_run.return_value = JOB2_RUN_DESCRIPTION
 
-        query_status = self.emr_containers.poll_query_status(job_id="job123456", max_polling_attempts=2)
+        query_status = self.emr_containers.poll_query_status(
+            job_id="job123456", max_polling_attempts=2, poll_interval=2
+        )

Review Comment:
   This simple change actually speedup test execution from 30 sec to 2 seconds
   
   _before_
   ```
   ============================ slowest 100 durations =============================
     30.03s call     tests/providers/amazon/aws/hooks/test_emr_containers.py::TestEmrContainerHook::test_query_status_polling_with_timeout
   ```
   
   _after_
   ```
   2.01s call     tests/providers/amazon/aws/hooks/test_emr_containers.py::TestEmrContainerHook::test_query_status_polling_with_timeout
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ferruzzi commented on a diff in pull request #28039: Migrate amazon provider hooks tests from `unittests` to `pytest`

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on code in PR #28039:
URL: https://github.com/apache/airflow/pull/28039#discussion_r1037539606


##########
tests/providers/amazon/aws/hooks/test_glue_crawler.py:
##########
@@ -83,18 +82,17 @@
 }
 
 
-class TestGlueCrawlerHook(unittest.TestCase):
-    @classmethod
-    def setUp(cls):
-        cls.hook = GlueCrawlerHook(aws_conn_id="aws_default")
+class TestGlueCrawlerHook:
+    def setup_method(self):
+        self.hook = GlueCrawlerHook(aws_conn_id="aws_default")
 
     def test_init(self):
-        self.assertEqual(self.hook.aws_conn_id, "aws_default")
+        assert self.hook.aws_conn_id == "aws_default"
 
     @mock.patch.object(GlueCrawlerHook, "get_conn")

Review Comment:
   I mentioned on slack that I had started on some of these; this was the rabbit hole that got me there... I started updating any hooks that still used get_conn() instead of the conn cached property, which led to updating the mocked values in the unit tests, which led to converting them all to pytest which... never got finished :P   So again, thanks for taking this on!



##########
tests/providers/amazon/aws/hooks/test_glacier.py:
##########
@@ -47,25 +47,21 @@ def test_retrieve_inventory_should_return_job_id(self, mock_conn):
         assert job_id == result
 
     @mock.patch("airflow.providers.amazon.aws.hooks.glacier.GlacierHook.get_conn")
-    def test_retrieve_inventory_should_log_mgs(self, mock_conn):
+    def test_retrieve_inventory_should_log_mgs(self, mock_conn, caplog):
         # given

Review Comment:
   Do we want to remove the `given/when/then` comments in the very few places they still exist while we're at it?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Taragolis closed pull request #28039: Migrate amazon provider hooks tests from `unittests` to `pytest`

Posted by GitBox <gi...@apache.org>.
Taragolis closed pull request #28039: Migrate amazon provider hooks tests from `unittests` to `pytest`
URL: https://github.com/apache/airflow/pull/28039


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org