Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/11/16 23:42:09 UTC

[GitHub] [airflow] dsynkov opened a new pull request #12389: [AIRFLOW-5115] Bugfix for S3KeySensor failing to accept template_fields

dsynkov opened a new pull request #12389:
URL: https://github.com/apache/airflow/pull/12389


   Fixes a bug in the `S3KeySensor` that prevents the use of Jinja-templated strings as arguments to `bucket_key` and `bucket_name`. The cause is that the constructor performs URL-parsing validation on these arguments _before_ the template gets rendered.
   
   This PR moves the validation downstream to the `poke` method, which runs after rendering. This unblocks users from passing Jinja-templated values for these arguments to the sensor.
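
As a rough illustration of why the eager check fails, the sketch below reproduces the validation with the standard library only. `resolve_bucket_and_key` is a hypothetical helper, not part of Airflow, `ValueError` stands in for `AirflowException`, and the wording of the second error message is an assumption.

```python
from urllib.parse import urlparse


def resolve_bucket_and_key(bucket_name, bucket_key):
    """Hypothetical helper mirroring the checks this PR moves into `poke`."""
    parsed = urlparse(bucket_key)
    if bucket_name is None:
        if parsed.netloc == "":
            # A relative key carries no bucket, so one must be given explicitly.
            raise ValueError(
                "If key is a relative path from root, please provide a bucket_name"
            )
        return parsed.netloc, parsed.path.lstrip("/")
    if parsed.scheme != "" or parsed.netloc != "":
        # Assumed wording: a full s3:// URL already names a bucket.
        raise ValueError(
            "If bucket_name is provided, bucket_key should be a relative path"
        )
    return bucket_name, bucket_key


# An unrendered template has no netloc, so a check run at construction
# time rejects it even though the rendered value would be valid.
unrendered = "{{ params.s3_url }}"
print(urlparse(unrendered).netloc == "")

# Once the template has been rendered (done by hand here), the check passes.
rendered = "s3://bucket/key"
print(resolve_bucket_and_key(None, rendered))
```

Running the check only at poke time means it always sees the rendered string, at the cost of reporting a misconfigured key later than before.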
   
   To the best of my knowledge there is no corresponding issue in GitHub, but I have linked the Jira issue.
   
   closes: https://issues.apache.org/jira/browse/AIRFLOW-5115
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dsynkov commented on a change in pull request #12389: [AIRFLOW-5115] Bugfix for S3KeySensor failing to accept template_fields

Posted by GitBox <gi...@apache.org>.
dsynkov commented on a change in pull request #12389:
URL: https://github.com/apache/airflow/pull/12389#discussion_r528406944



##########
File path: tests/providers/amazon/aws/sensors/test_s3_key.py
##########
@@ -32,32 +32,40 @@ def test_bucket_name_none_and_bucket_key_as_relative_path(self):
         and bucket_key is provided as relative path rather than s3:// url.
         :return:
         """
+        op = S3KeySensor(task_id='s3_key_sensor', bucket_key="file_in_bucket")
         with self.assertRaises(AirflowException):
-            S3KeySensor(task_id='s3_key_sensor', bucket_key="file_in_bucket")
+            op.poke(None)
 
     def test_bucket_name_provided_and_bucket_key_is_s3_url(self):
         """
         Test if exception is raised when bucket_name is provided
         while bucket_key is provided as a full s3:// url.
         :return:
         """
+        op = S3KeySensor(
+            task_id='s3_key_sensor', bucket_key="s3://test_bucket/file", bucket_name='test_bucket'
+        )
         with self.assertRaises(AirflowException):
-            S3KeySensor(
-                task_id='s3_key_sensor', bucket_key="s3://test_bucket/file", bucket_name='test_bucket'
-            )
+            op.poke(None)
 
     @parameterized.expand(
         [
             ['s3://bucket/key', None, 'key', 'bucket'],
             ['key', 'bucket', 'key', 'bucket'],
         ]
     )
-    def test_parse_bucket_key(self, key, bucket, parsed_key, parsed_bucket):
+    @mock.patch('airflow.providers.amazon.aws.sensors.s3_key.S3Hook')

Review comment:
       Hey @potiuk; just following up re: the above comment. Thanks!







[GitHub] [airflow] potiuk commented on pull request #12389: [AIRFLOW-5115] Bugfix for S3KeySensor failing to accept template_fields

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #12389:
URL: https://github.com/apache/airflow/pull/12389#issuecomment-732003461


   Fantastic! Thanks!





[GitHub] [airflow] potiuk commented on a change in pull request #12389: [AIRFLOW-5115] Bugfix for S3KeySensor failing to accept template_fields

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #12389:
URL: https://github.com/apache/airflow/pull/12389#discussion_r525924946



##########
File path: tests/providers/amazon/aws/sensors/test_s3_key.py
##########
@@ -32,32 +32,40 @@ def test_bucket_name_none_and_bucket_key_as_relative_path(self):
         and bucket_key is provided as relative path rather than s3:// url.
         :return:
         """
+        op = S3KeySensor(task_id='s3_key_sensor', bucket_key="file_in_bucket")
         with self.assertRaises(AirflowException):
-            S3KeySensor(task_id='s3_key_sensor', bucket_key="file_in_bucket")
+            op.poke(None)
 
     def test_bucket_name_provided_and_bucket_key_is_s3_url(self):
         """
         Test if exception is raised when bucket_name is provided
         while bucket_key is provided as a full s3:// url.
         :return:
         """
+        op = S3KeySensor(
+            task_id='s3_key_sensor', bucket_key="s3://test_bucket/file", bucket_name='test_bucket'
+        )
         with self.assertRaises(AirflowException):
-            S3KeySensor(
-                task_id='s3_key_sensor', bucket_key="s3://test_bucket/file", bucket_name='test_bucket'
-            )
+            op.poke(None)
 
     @parameterized.expand(
         [
             ['s3://bucket/key', None, 'key', 'bucket'],
             ['key', 'bucket', 'key', 'bucket'],
         ]
     )
-    def test_parse_bucket_key(self, key, bucket, parsed_key, parsed_bucket):
+    @mock.patch('airflow.providers.amazon.aws.sensors.s3_key.S3Hook')

Review comment:
       Would you mind adding a test with a Jinja template in it? I think that would be the ultimate test here :).







[GitHub] [airflow] dsynkov commented on a change in pull request #12389: [AIRFLOW-5115] Bugfix for S3KeySensor failing to accept template_fields

Posted by GitBox <gi...@apache.org>.
dsynkov commented on a change in pull request #12389:
URL: https://github.com/apache/airflow/pull/12389#discussion_r526430786



##########
File path: tests/providers/amazon/aws/sensors/test_s3_key.py
##########
@@ -32,32 +32,40 @@ def test_bucket_name_none_and_bucket_key_as_relative_path(self):
         and bucket_key is provided as relative path rather than s3:// url.
         :return:
         """
+        op = S3KeySensor(task_id='s3_key_sensor', bucket_key="file_in_bucket")
         with self.assertRaises(AirflowException):
-            S3KeySensor(task_id='s3_key_sensor', bucket_key="file_in_bucket")
+            op.poke(None)
 
     def test_bucket_name_provided_and_bucket_key_is_s3_url(self):
         """
         Test if exception is raised when bucket_name is provided
         while bucket_key is provided as a full s3:// url.
         :return:
         """
+        op = S3KeySensor(
+            task_id='s3_key_sensor', bucket_key="s3://test_bucket/file", bucket_name='test_bucket'
+        )
         with self.assertRaises(AirflowException):
-            S3KeySensor(
-                task_id='s3_key_sensor', bucket_key="s3://test_bucket/file", bucket_name='test_bucket'
-            )
+            op.poke(None)
 
     @parameterized.expand(
         [
             ['s3://bucket/key', None, 'key', 'bucket'],
             ['key', 'bucket', 'key', 'bucket'],
         ]
     )
-    def test_parse_bucket_key(self, key, bucket, parsed_key, parsed_bucket):
+    @mock.patch('airflow.providers.amazon.aws.sensors.s3_key.S3Hook')

Review comment:
       Thanks; I added it as a separate test because, as far as I understand, rendering requires creating a `TaskInstance`. Let me know if that's what you had in mind.
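
The shape of such a test can be sketched without Airflow. Everything below is a stand-in: `FakeKeySensor`, its `render` method, and the regex-based substitution are hypothetical and only imitate the order of operations (construct, render, then poke). The real test would render through a `TaskInstance`, and the real `poke` returns a boolean rather than a tuple.

```python
import re
from urllib.parse import urlparse


class FakeKeySensor:
    """Hypothetical stand-in for the sensor; not the Airflow class."""

    template_fields = ("bucket_key", "bucket_name")

    def __init__(self, bucket_key, bucket_name=None):
        # Store the raw, possibly templated values; no validation here.
        self.bucket_key = bucket_key
        self.bucket_name = bucket_name

    def render(self, context):
        # Stand-in for Jinja rendering: replaces "{{ name }}" with context[name].
        pattern = re.compile(r"\{\{\s*(\w+)\s*\}\}")
        for field in self.template_fields:
            value = getattr(self, field)
            if isinstance(value, str):
                rendered = pattern.sub(lambda m: str(context[m.group(1)]), value)
                setattr(self, field, rendered)

    def poke(self, context):
        # Validation happens here, after rendering.
        if self.bucket_name is None:
            parsed = urlparse(self.bucket_key)
            if parsed.netloc == "":
                raise ValueError(
                    "If key is a relative path from root, please provide a bucket_name"
                )
            self.bucket_name = parsed.netloc
            self.bucket_key = parsed.path.lstrip("/")
        # Returns the parsed pair for inspection (the real sensor returns a bool).
        return self.bucket_name, self.bucket_key


def test_templated_key_is_parsed_after_rendering():
    # Construction must not raise, even though the key is still a template.
    sensor = FakeKeySensor(bucket_key="{{ s3_url }}")
    sensor.render({"s3_url": "s3://bucket/key"})
    assert sensor.poke(None) == ("bucket", "key")


test_templated_key_is_parsed_after_rendering()
```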







[GitHub] [airflow] turbaszek commented on a change in pull request #12389: [AIRFLOW-5115] Bugfix for S3KeySensor failing to accept template_fields

Posted by GitBox <gi...@apache.org>.
turbaszek commented on a change in pull request #12389:
URL: https://github.com/apache/airflow/pull/12389#discussion_r526245635



##########
File path: airflow/providers/amazon/aws/sensors/s3_key.py
##########
@@ -69,30 +69,33 @@ def __init__(
         **kwargs,
     ):
         super().__init__(**kwargs)
-        # Parse
-        if bucket_name is None:
-            parsed_url = urlparse(bucket_key)
+
+        self.bucket_name = bucket_name
+        self.bucket_key = bucket_key
+        self.wildcard_match = wildcard_match
+        self.aws_conn_id = aws_conn_id
+        self.verify = verify
+        self.hook: Optional[S3Hook] = None
+
+    def poke(self, context):
+
+        if self.bucket_name is None:
+            parsed_url = urlparse(self.bucket_key)
             if parsed_url.netloc == '':
-                raise AirflowException('Please provide a bucket_name')
-            else:
-                bucket_name = parsed_url.netloc
-                bucket_key = parsed_url.path.lstrip('/')
+                raise AirflowException(
+                    'If key is a relative path from root,' + ' please provide a bucket_name'

Review comment:
       ```suggestion
                       'If key is a relative path from root, please provide a bucket_name'
   ```







[GitHub] [airflow] github-actions[bot] commented on pull request #12389: [AIRFLOW-5115] Bugfix for S3KeySensor failing to accept template_fields

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #12389:
URL: https://github.com/apache/airflow/pull/12389#issuecomment-728644929


   [The Workflow run](https://github.com/apache/airflow/actions/runs/367266746) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.





[GitHub] [airflow] dsynkov commented on a change in pull request #12389: [AIRFLOW-5115] Bugfix for S3KeySensor failing to accept template_fields

Posted by GitBox <gi...@apache.org>.
dsynkov commented on a change in pull request #12389:
URL: https://github.com/apache/airflow/pull/12389#discussion_r526361395



##########
File path: airflow/providers/amazon/aws/sensors/s3_key.py
##########
@@ -69,30 +69,33 @@ def __init__(
         **kwargs,
     ):
         super().__init__(**kwargs)
-        # Parse
-        if bucket_name is None:
-            parsed_url = urlparse(bucket_key)
+
+        self.bucket_name = bucket_name
+        self.bucket_key = bucket_key
+        self.wildcard_match = wildcard_match
+        self.aws_conn_id = aws_conn_id
+        self.verify = verify
+        self.hook: Optional[S3Hook] = None
+
+    def poke(self, context):
+
+        if self.bucket_name is None:
+            parsed_url = urlparse(self.bucket_key)
             if parsed_url.netloc == '':
-                raise AirflowException('Please provide a bucket_name')
-            else:
-                bucket_name = parsed_url.netloc
-                bucket_key = parsed_url.path.lstrip('/')
+                raise AirflowException(
+                    'If key is a relative path from root,' + ' please provide a bucket_name'

Review comment:
       Thanks; added!







[GitHub] [airflow] boring-cyborg[bot] commented on pull request #12389: [AIRFLOW-5115] Bugfix for S3KeySensor failing to accept template_fields

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on pull request #12389:
URL: https://github.com/apache/airflow/pull/12389#issuecomment-732003371


   Awesome work, congrats on your first merged pull request!
   





[GitHub] [airflow] boring-cyborg[bot] commented on pull request #12389: [AIRFLOW-5115] Bugfix for S3KeySensor failing to accept template_fields

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on pull request #12389:
URL: https://github.com/apache/airflow/pull/12389#issuecomment-728423059


   Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst).
   Here are some useful points:
   - Pay attention to the quality of your code (flake8, pylint and type annotations). Our [pre-commits](https://github.com/apache/airflow/blob/master/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that.
   - In case of a new feature add useful documentation (in docstrings or in `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/master/docs/howto/custom-operator.rst). Consider adding an example DAG that shows how users should use it.
   - Consider using [Breeze environment](https://github.com/apache/airflow/blob/master/BREEZE.rst) for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
   - Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
   - Please follow [ASF Code of Conduct](https://www.apache.org/foundation/policies/conduct) for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
   - Be sure to read the [Airflow Coding style](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#coding-style-and-best-practices).
   Apache Airflow is a community-driven project and together we are making it better 🚀.
   In case of doubts contact the developers at:
   Mailing List: dev@airflow.apache.org
   Slack: https://s.apache.org/airflow-slack
   





[GitHub] [airflow] potiuk merged pull request #12389: [AIRFLOW-5115] Bugfix for S3KeySensor failing to accept template_fields

Posted by GitBox <gi...@apache.org>.
potiuk merged pull request #12389:
URL: https://github.com/apache/airflow/pull/12389


   

