You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/10/14 18:43:49 UTC

[GitHub] [airflow] potix2 opened a new pull request #11531: Add type hints to aws provider

potix2 opened a new pull request #11531:
URL: https://github.com/apache/airflow/pull/11531


   This PR is about increasing typing coverage for the aws provider. Part of: #9708
   I added type hints to the rest modules of amazon provider.
   
   The typing coverage of amazon provider becomes 77.7.
   
   This PR relates to #11434. 
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potix2 commented on a change in pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
potix2 commented on a change in pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#discussion_r505800924



##########
File path: airflow/providers/amazon/aws/hooks/sagemaker.py
##########
@@ -105,22 +107,14 @@ def secondary_training_status_message(job_description, prev_description):
 
     :return: Job status string to be printed.
     """
-    if (
-        job_description is None
-        or job_description.get('SecondaryStatusTransitions') is None
-        or len(job_description.get('SecondaryStatusTransitions')) == 0
-    ):
+    current_transitions = job_description.get('SecondaryStatusTransitions')
+    if current_transitions is None or len(current_transitions) == 0:
         return ''
 
-    prev_description_secondary_transitions = (
-        prev_description.get('SecondaryStatusTransitions') if prev_description is not None else None
-    )
-    prev_transitions_num = (
-        len(prev_description['SecondaryStatusTransitions'])
-        if prev_description_secondary_transitions is not None
-        else 0
-    )
-    current_transitions = job_description['SecondaryStatusTransitions']
+    prev_transitions_num = 0

Review comment:
       To avoid deeply nested if-statements I rearrange the order of code.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potix2 commented on a change in pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
potix2 commented on a change in pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#discussion_r505808181



##########
File path: airflow/providers/amazon/aws/operators/datasync.py
##########
@@ -307,18 +317,22 @@ def _create_datasync_task(self):
         if not self.task_arn:
             raise AirflowException("Task could not be created")
         self.log.info("Created a Task with TaskArn %s", self.task_arn)
-        return self.task_arn
 
-    def _update_datasync_task(self):
+    def _update_datasync_task(self) -> None:
         """Update a AWS DataSyncTask."""
+        if not self.task_arn:
+            return
+
         hook = self.get_hook()
         self.log.info("Updating TaskArn %s", self.task_arn)
         hook.update_task(self.task_arn, **self.update_task_kwargs)
         self.log.info("Updated TaskArn %s", self.task_arn)
-        return self.task_arn

Review comment:
       This returned value is not used anywhere also.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potix2 commented on a change in pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
potix2 commented on a change in pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#discussion_r505802397



##########
File path: airflow/providers/amazon/aws/hooks/sagemaker.py
##########
@@ -278,7 +272,7 @@ def log_stream(self, log_group, stream_name, start_time=0, skip=0):
 
         return self.logs_hook.get_log_events(log_group, stream_name, start_time, skip)
 
-    def multi_stream_iter(self, log_group, streams, positions=None):
+    def multi_stream_iter(self, log_group: str, streams: list, positions=None) -> Generator:

Review comment:
       It's too hard to add type hint to `positions`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#issuecomment-713822846


   > I rebased but CI looks like hung up... After a few hours I'll try it again.
   
   Everything is green with me. Should I merge it or would you like to make any more changes?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potix2 commented on a change in pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
potix2 commented on a change in pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#discussion_r508883308



##########
File path: airflow/providers/amazon/aws/operators/sagemaker_base.py
##########
@@ -97,3 +97,8 @@ def preprocess_config(self):
 
     def execute(self, context):
         raise NotImplementedError('Please implement execute() in sub class!')
+
+    @cached_property
+    def hook(self):

Review comment:
       Me,too. I would like to rewrite all get_hook() to this way but I think it's out of scope for this work, so I keep the original form.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#issuecomment-713269655


   Can you do a rebase? This is the only thing blocking me from merging this change.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potix2 commented on a change in pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
potix2 commented on a change in pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#discussion_r505807619



##########
File path: airflow/providers/amazon/aws/operators/datasync.py
##########
@@ -307,18 +317,22 @@ def _create_datasync_task(self):
         if not self.task_arn:
             raise AirflowException("Task could not be created")
         self.log.info("Created a Task with TaskArn %s", self.task_arn)
-        return self.task_arn

Review comment:
       This returned value is not used anywhere and `_create_datasync_task` is private method.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potix2 commented on a change in pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
potix2 commented on a change in pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#discussion_r505808993



##########
File path: airflow/providers/amazon/aws/operators/glue.py
##########
@@ -87,7 +88,7 @@ def __init__(
         self.s3_bucket = s3_bucket
         self.iam_role_name = iam_role_name
         self.s3_protocol = "s3://"
-        self.s3_artifcats_prefix = 'artifacts/glue-scripts/'
+        self.s3_artifacts_prefix = 'artifacts/glue-scripts/'

Review comment:
       fix typo




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj merged pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
mik-laj merged pull request #11531:
URL: https://github.com/apache/airflow/pull/11531


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potix2 commented on a change in pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
potix2 commented on a change in pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#discussion_r508883308



##########
File path: airflow/providers/amazon/aws/operators/sagemaker_base.py
##########
@@ -97,3 +97,8 @@ def preprocess_config(self):
 
     def execute(self, context):
         raise NotImplementedError('Please implement execute() in sub class!')
+
+    @cached_property
+    def hook(self):

Review comment:
       Me,too. I would like to rewrite all get_hook() to this way but I think it's out of scope this work, so I keep the original form.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mlgruby commented on pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
mlgruby commented on pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#issuecomment-712688010


   Yup I can have a look


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potix2 commented on a change in pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
potix2 commented on a change in pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#discussion_r505790657



##########
File path: tests/providers/amazon/aws/operators/test_batch.py
##########
@@ -86,7 +86,7 @@ def test_init(self):
         self.assertEqual(self.batch.waiters, None)
         self.assertEqual(self.batch.hook.max_retries, self.MAX_RETRIES)
         self.assertEqual(self.batch.hook.status_retries, self.STATUS_RETRIES)
-        self.assertEqual(self.batch.parameters, None)

Review comment:
       We cannot pass `None` as the argument `parameters`. The API document is [here](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/batch.html#Batch.Client.submit_job). 
   
   I checked the behavior of `submit_job` with the following code.
   ```python
   #!/usr/bin/env python
   
   import boto3
   import json
   
   cl = boto3.client('batch')
   ap=None
   params=None
   overrides=None
   resp = cl.submit_job(
           jobName='myjob-1',
           jobQueue='test-batch-job-queue',
           jobDefinition='test',
           arrayProperties=ap,
           parameters=params,
           containerOverrides=overrides,
           )
   print(json.dumps(resp, indent=4, sort_keys=True))
   ```
   
   Result:
   ```
   Traceback (most recent call last):
     File "main.py", line 17, in <module>
       containerOverrides=overrides,
     File "/Users/potix2/.pyenv/versions/3.7.5/lib/python3.7/site-packages/botocore/client.py", line 357, in _api_call
       return self._make_api_call(operation_name, kwargs)
     File "/Users/potix2/.pyenv/versions/3.7.5/lib/python3.7/site-packages/botocore/client.py", line 649, in _make_api_call
       api_params, operation_model, context=request_context)
     File "/Users/potix2/.pyenv/versions/3.7.5/lib/python3.7/site-packages/botocore/client.py", line 697, in _convert_to_request_dict
       api_params, operation_model)
     File "/Users/potix2/.pyenv/versions/3.7.5/lib/python3.7/site-packages/botocore/validate.py", line 297, in serialize_to_request
       raise ParamValidationError(report=report.generate_report())
   botocore.exceptions.ParamValidationError: Parameter validation failed:
   Invalid type for parameter arrayProperties, value: None, type: <class 'NoneType'>, valid types: <class 'dict'>
   Invalid type for parameter parameters, value: None, type: <class 'NoneType'>, valid types: <class 'dict'>
   Invalid type for parameter containerOverrides, value: None, type: <class 'NoneType'>, valid types: <class 'dict'>




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#issuecomment-708607377


   [The Workflow run](https://github.com/apache/airflow/actions/runs/307153993) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks$,^Build docs$,^Spell check docs$,^Backport packages$,^Checks: Helm tests$,^Test OpenAPI*.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potix2 commented on pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
potix2 commented on pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#issuecomment-714015305


   @mik-laj It's ready for merging. I overlooked that static checking was successful because `CI Build / Static checks: basic checks only`  was skipped.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potix2 commented on a change in pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
potix2 commented on a change in pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#discussion_r505808378



##########
File path: airflow/providers/amazon/aws/operators/datasync.py
##########
@@ -342,26 +356,27 @@ def _execute_datasync_task(self):
 
         if not result:
             raise AirflowException("Failed TaskExecutionArn %s" % self.task_execution_arn)
-        return self.task_execution_arn

Review comment:
       This one is also.

##########
File path: airflow/providers/amazon/aws/operators/datasync.py
##########
@@ -342,26 +356,27 @@ def _execute_datasync_task(self):
 
         if not result:
             raise AirflowException("Failed TaskExecutionArn %s" % self.task_execution_arn)
-        return self.task_execution_arn
 
-    def on_kill(self):
+    def on_kill(self) -> None:
         """Cancel the submitted DataSync task."""
         hook = self.get_hook()
         if self.task_execution_arn:
             self.log.info("Cancelling TaskExecutionArn %s", self.task_execution_arn)
             hook.cancel_task_execution(task_execution_arn=self.task_execution_arn)
             self.log.info("Cancelled TaskExecutionArn %s", self.task_execution_arn)
 
-    def _delete_datasync_task(self):
+    def _delete_datasync_task(self) -> None:
         """Deletes an AWS DataSync Task."""
+        if not self.task_arn:
+            return
+
         hook = self.get_hook()
         # Delete task:
         self.log.info("Deleting Task with TaskArn %s", self.task_arn)
         hook.delete_task(self.task_arn)
         self.log.info("Task Deleted")
-        return self.task_arn

Review comment:
       This one is also.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potix2 commented on a change in pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
potix2 commented on a change in pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#discussion_r505815593



##########
File path: airflow/providers/google/cloud/operators/bigquery.py
##########
@@ -947,9 +947,7 @@ def execute(self, context) -> None:
                 delegate_to=self.delegate_to,
                 impersonation_chain=self.impersonation_chain,
             )
-            schema_fields = json.loads(
-                gcs_hook.download(gcs_bucket, gcs_object).decode("utf-8")  # type: ignore[attr-defined]
-            )  # type: ignore[attr-defined]
+            schema_fields = json.loads(gcs_hook.download(gcs_bucket, gcs_object))

Review comment:
       The call to `decode` requires `bytes` but the return type of `download` is `Union[str, bytes]`. If we pass `bytes` object to `json.loads`, `decode` is called in `json.loads`, so I think it's better not to care about the returned object's type from `download` in this scope.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#issuecomment-708631694


   [The Workflow run](https://github.com/apache/airflow/actions/runs/307215976) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks$,^Build docs$,^Spell check docs$,^Backport packages$,^Checks: Helm tests$,^Test OpenAPI*.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mlgruby commented on a change in pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
mlgruby commented on a change in pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#discussion_r508813832



##########
File path: airflow/providers/amazon/aws/sensors/cloud_formation.py
##########
@@ -86,8 +95,10 @@ def poke(self, context):
             return False
         raise ValueError(f'Stack {self.stack_name} in bad state: {stack_status}')
 
-    def get_hook(self):
+    def get_hook(self) -> AWSCloudFormationHook:
         """Create and return an AWSCloudFormationHook"""
-        if not self.hook:
-            self.hook = AWSCloudFormationHook(aws_conn_id=self.aws_conn_id, region_name=self.region_name)
+        if self.hook:

Review comment:
       This `if` is because of `None` default value of `hook` and hence mypy complaining for it? 
   I have seen several other places the same new if statement. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mlgruby commented on a change in pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
mlgruby commented on a change in pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#discussion_r508807981



##########
File path: airflow/providers/amazon/aws/hooks/sagemaker.py
##########
@@ -918,7 +912,9 @@ def list_processing_jobs(self, **kwargs) -> List[Dict]:  # noqa: D402
         )
         return results
 
-    def _list_request(self, partial_func, result_key: str, max_results: Optional[int] = None) -> List[Dict]:
+    def _list_request(
+        self, partial_func: Callable, result_key: str, max_results: Optional[int] = None
+    ) -> List[Dict]:

Review comment:
       ```suggestion
       ) -> List[dict]:
   ```
   if we are not specifying types for keys and values then let's keep it simple with `List[dict]`

##########
File path: airflow/providers/amazon/aws/sensors/cloud_formation.py
##########
@@ -86,8 +95,10 @@ def poke(self, context):
             return False
         raise ValueError(f'Stack {self.stack_name} in bad state: {stack_status}')
 
-    def get_hook(self):
+    def get_hook(self) -> AWSCloudFormationHook:
         """Create and return an AWSCloudFormationHook"""
-        if not self.hook:
-            self.hook = AWSCloudFormationHook(aws_conn_id=self.aws_conn_id, region_name=self.region_name)
+        if self.hook:

Review comment:
       This `if` is because of `None` default value of `hook` and hence mypy complaining for it? 

##########
File path: airflow/providers/amazon/aws/operators/sagemaker_base.py
##########
@@ -41,12 +43,11 @@ class SageMakerBaseOperator(BaseOperator):
     integer_fields = []  # type: Iterable[Iterable[str]]
 
     @apply_defaults
-    def __init__(self, *, config, aws_conn_id='aws_default', **kwargs):
+    def __init__(self, *, config: dict, aws_conn_id: str = 'aws_default', **kwargs):
         super().__init__(**kwargs)
 
         self.aws_conn_id = aws_conn_id
         self.config = config
-        self.hook = None

Review comment:
       Well, I liked this one :) 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potix2 commented on a change in pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
potix2 commented on a change in pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#discussion_r505805235



##########
File path: airflow/providers/amazon/aws/operators/datasync.py
##########
@@ -165,27 +166,29 @@ def __init__(
             )
 
         # Others
-        self.hook = None
+        self.hook: Optional[AWSDataSyncHook] = None
         # Candidates - these are found in AWS as possible things
         # for us to use
-        self.candidate_source_location_arns = None
-        self.candidate_destination_location_arns = None
-        self.candidate_task_arns = None
+        self.candidate_source_location_arns: Optional[List[str]] = None
+        self.candidate_destination_location_arns: Optional[List[str]] = None
+        self.candidate_task_arns: Optional[List[str]] = None
         # Actuals
-        self.source_location_arn = None
-        self.destination_location_arn = None
-        self.task_execution_arn = None
+        self.source_location_arn: Optional[str] = None
+        self.destination_location_arn: Optional[str] = None
+        self.task_execution_arn: Optional[str] = None
 
-    def get_hook(self):
+    def get_hook(self) -> AWSDataSyncHook:
         """Create and return AWSDataSyncHook.
 
         :return AWSDataSyncHook: An AWSDataSyncHook instance.
         """
-        if not self.hook:

Review comment:
       `self.hook` is initialized by `None` so its type is `Optional[AWSDataSyncHook]`. To eliminate the possibility that `get_hook` returns `None`, we need to rearrange the order of evaluation.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potix2 commented on pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
potix2 commented on pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#issuecomment-713821834


   I rebased but CI looks like hung up... After a few hours I'll try it again.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mlgruby commented on a change in pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
mlgruby commented on a change in pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#discussion_r508819208



##########
File path: airflow/providers/amazon/aws/operators/sagemaker_base.py
##########
@@ -97,3 +97,8 @@ def preprocess_config(self):
 
     def execute(self, context):
         raise NotImplementedError('Please implement execute() in sub class!')
+
+    @cached_property
+    def hook(self):

Review comment:
       Well, I liked this one :)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potix2 commented on a change in pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
potix2 commented on a change in pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#discussion_r505812440



##########
File path: airflow/providers/google/cloud/hooks/gcs.py
##########
@@ -260,7 +260,9 @@ def rewrite(
             destination_bucket.name,  # type: ignore[attr-defined]
         )
 
-    def download(self, object_name: str, bucket_name: Optional[str], filename: Optional[str] = None) -> str:
+    def download(
+        self, object_name: str, bucket_name: Optional[str], filename: Optional[str] = None
+    ) -> Union[str, bytes]:

Review comment:
       This method returns downloaded content as `bytes` or filename as `string`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potix2 commented on a change in pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
potix2 commented on a change in pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#discussion_r505798253



##########
File path: airflow/providers/amazon/aws/hooks/sagemaker.py
##########
@@ -105,22 +107,14 @@ def secondary_training_status_message(job_description, prev_description):
 
     :return: Job status string to be printed.
     """
-    if (
-        job_description is None

Review comment:
       `job_description` must not be `None` because its type is not `Optional[T]`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#discussion_r504923472



##########
File path: airflow/providers/amazon/aws/log/s3_task_handler.py
##########
@@ -146,6 +147,7 @@ def s3_read(self, remote_log_location, return_error=False):
         :param return_error: if True, returns a string error message if an
             error occurs. Otherwise returns '' when an error occurs.
         :type return_error: bool
+        :return the log found at the remote_log_location

Review comment:
       ```suggestion
           :return: the log found at the remote_log_location
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potix2 commented on a change in pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
potix2 commented on a change in pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#discussion_r505801559



##########
File path: airflow/providers/amazon/aws/hooks/sagemaker.py
##########
@@ -309,8 +303,8 @@ def multi_stream_iter(self, log_group, streams, positions=None):
                 events.append(None)
 
         while any(events):
-            i = argmin(events, lambda x: x['timestamp'] if x else 9999999999)
-            yield (i, events[i])
+            i = argmin(events, lambda x: x['timestamp'] if x else 9999999999) or 0

Review comment:
       `argmin`'s type is `Optional[T]` so we need to pass the default value.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potix2 commented on pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
potix2 commented on pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#issuecomment-710753161


   I checked failed tests in local but it's successful. I try to trigger actions in CI.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#discussion_r504924278



##########
File path: airflow/providers/amazon/aws/log/s3_task_handler.py
##########
@@ -146,6 +147,7 @@ def s3_read(self, remote_log_location, return_error=False):
         :param return_error: if True, returns a string error message if an
             error occurs. Otherwise returns '' when an error occurs.
         :type return_error: bool
+        :return the log found at the remote_log_location

Review comment:
       I accepted because I hope this will allow us to fix the documentation.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj edited a comment on pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#issuecomment-713822846


   > I rebased but CI looks like hung up... After a few hours I'll try it again.
   
   Everything is green with me. Should I merge it or would you like to make any more changes?
   <img width="926" alt="Screenshot 2020-10-21 at 21 25 21" src="https://user-images.githubusercontent.com/12058428/96772806-f2fb6e00-13e3-11eb-8935-60a439dad347.png">
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potix2 commented on a change in pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
potix2 commented on a change in pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#discussion_r505797410



##########
File path: airflow/providers/amazon/aws/hooks/glue.py
##########
@@ -97,14 +97,14 @@ def get_iam_execution_role(self) -> Dict:
             self.log.error("Failed to create aws glue job, error: %s", general_error)
             raise
 
-    def initialize_job(self, script_arguments: Optional[List] = None) -> Dict[str, str]:
+    def initialize_job(self, script_arguments: Optional[dict] = None) -> Dict[str, str]:

Review comment:
       `script_arguments` is passed to `Glue.Client.start_job_run` as `Arguments` whose type is `dict`.
   https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/glue.html#Glue.Client.start_job_run




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#issuecomment-712140145


   @mlgruby Can you look at it?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potix2 commented on a change in pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
potix2 commented on a change in pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#discussion_r505810546



##########
File path: airflow/providers/amazon/aws/operators/sagemaker_base.py
##########
@@ -41,12 +43,11 @@ class SageMakerBaseOperator(BaseOperator):
     integer_fields = []  # type: Iterable[Iterable[str]]
 
     @apply_defaults
-    def __init__(self, *, config, aws_conn_id='aws_default', **kwargs):
+    def __init__(self, *, config: dict, aws_conn_id: str = 'aws_default', **kwargs):
         super().__init__(**kwargs)
 
         self.aws_conn_id = aws_conn_id
         self.config = config
-        self.hook = None

Review comment:
       This attribute is directly referenced from child classes. To avoid `None` I defined it as cached_property.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potix2 commented on a change in pull request #11531: Add type hints to aws provider

Posted by GitBox <gi...@apache.org>.
potix2 commented on a change in pull request #11531:
URL: https://github.com/apache/airflow/pull/11531#discussion_r508881355



##########
File path: airflow/providers/amazon/aws/sensors/cloud_formation.py
##########
@@ -86,8 +95,10 @@ def poke(self, context):
             return False
         raise ValueError(f'Stack {self.stack_name} in bad state: {stack_status}')
 
-    def get_hook(self):
+    def get_hook(self) -> AWSCloudFormationHook:
         """Create and return an AWSCloudFormationHook"""
-        if not self.hook:
-            self.hook = AWSCloudFormationHook(aws_conn_id=self.aws_conn_id, region_name=self.region_name)
+        if self.hook:

Review comment:
       Yes. get_hook() returns an actual value, but in the original form mypy hence type error.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org