You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/12/15 00:24:30 UTC

[GitHub] [airflow] khalidmammadov opened a new pull request #20306: Fixing MyPy issues inside airflow/serialization and airflow/sensors

khalidmammadov opened a new pull request #20306:
URL: https://github.com/apache/airflow/pull/20306


   Part of: #19891
   
   Fixing MyPy issues inside airflow/serialization and airflow/sensors:
   
   airflow/serialization
   - Change reference to timezone path using absolute path
   - Handle None case for Optional type in return statement for _get_registered_timetable function
   - Fix type signature for _serialize_params_dict inline with changes made https://github.com/apache/airflow/commit/7d8e3b828af0ac90261c341f5cb0e57da75e6a83#
   	in https://github.com/apache/airflow/blob/7d8e3b828af0ac90261c341f5cb0e57da75e6a83/airflow/models/baseoperator.py#L649
   - Ignore error when MyPy can't detect type correctly for task_dict[val] returned value
   
   airflow/sensors
   - Use Collection type to handle "Sized" error for len function argument when Iterable used
   - Provide "hint" to MyPy that List is also Iterable for total_states variable
   - Provide type for get_count function arguments
   - Wrap execution _handle_execution_date_fn around if statement to ensure it's only executed when required "function" and "context" variables are present
   
   
   
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/main/UPDATING.md).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] khalidmammadov commented on a change in pull request #20306: Fixing MyPy issues inside airflow/serialization and airflow/sensors

Posted by GitBox <gi...@apache.org>.
khalidmammadov commented on a change in pull request #20306:
URL: https://github.com/apache/airflow/pull/20306#discussion_r769465486



##########
File path: airflow/serialization/serialized_objects.py
##########
@@ -118,7 +118,7 @@ def encode_timezone(var: Timezone) -> Union[str, int]:
 
 def decode_timezone(var: Union[str, int]) -> Timezone:
     """Decode a previously serialized Pendulum Timezone."""
-    return pendulum.timezone(var)
+    return pendulum.tz.timezone(var)

Review comment:
       I know it's exposed in the __init__.py but it is saying that pendulum is not callable for some reason. This solved that issue and I checked across airflow project it's used in a both ways.
   Actual error:
   ```
   airflow/serialization/serialized_objects.py:121: error: Module not callable
           return pendulum.timezone(var)
                  ^
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk merged pull request #20306: Fixing MyPy issues inside airflow/serialization

Posted by GitBox <gi...@apache.org>.
potiuk merged pull request #20306:
URL: https://github.com/apache/airflow/pull/20306


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #20306: Fixing MyPy issues inside airflow/serialization and airflow/sensors

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #20306:
URL: https://github.com/apache/airflow/pull/20306#issuecomment-997233591


   Conflicts need rebasing!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on a change in pull request #20306: Fixing MyPy issues inside airflow/serialization and airflow/sensors

Posted by GitBox <gi...@apache.org>.
uranusjr commented on a change in pull request #20306:
URL: https://github.com/apache/airflow/pull/20306#discussion_r769246470



##########
File path: airflow/sensors/external_task.py
##########
@@ -258,16 +259,17 @@ def _handle_execution_date_fn(self, context) -> Any:
         implementation to pass all context variables as keyword arguments, to allow
         for more sophisticated returns of dates to return.
         """
-        from airflow.utils.operator_helpers import make_kwargs_callable
-
-        # Remove "execution_date" because it is already a mandatory positional argument
-        execution_date = context["execution_date"]
-        kwargs = {k: v for k, v in context.items() if k != "execution_date"}
-        # Add "context" in the kwargs for backward compatibility (because context used to be
-        # an acceptable argument of execution_date_fn)
-        kwargs["context"] = context
-        kwargs_callable = make_kwargs_callable(self.execution_date_fn)
-        return kwargs_callable(execution_date, **kwargs)
+        if self.execution_date_fn and context:
+            from airflow.utils.operator_helpers import make_kwargs_callable
+
+            # Remove "execution_date" because it is already a mandatory positional argument
+            execution_date = context["execution_date"]
+            kwargs = {k: v for k, v in context.items() if k != "execution_date"}
+            # Add "context" in the kwargs for backward compatibility (because context used to be
+            # an acceptable argument of execution_date_fn)
+            kwargs["context"] = context
+            kwargs_callable = make_kwargs_callable(self.execution_date_fn)
+            return kwargs_callable(execution_date, **kwargs)

Review comment:
       Better to raise an exception on `not self.execution_date_fn or not context` instead of returning None.

##########
File path: airflow/sensors/external_task.py
##########
@@ -108,7 +109,7 @@ def __init__(
         self.allowed_states = list(allowed_states) if allowed_states else [State.SUCCESS]
         self.failed_states = list(failed_states) if failed_states else []
 
-        total_states = self.allowed_states + self.failed_states
+        total_states: Iterable = self.allowed_states + self.failed_states

Review comment:
       ```suggestion
           total_states: Iterable[str] = self.allowed_states + self.failed_states
   ```

##########
File path: airflow/serialization/serialized_objects.py
##########
@@ -118,7 +118,7 @@ def encode_timezone(var: Timezone) -> Union[str, int]:
 
 def decode_timezone(var: Union[str, int]) -> Timezone:
     """Decode a previously serialized Pendulum Timezone."""
-    return pendulum.timezone(var)
+    return pendulum.tz.timezone(var)

Review comment:
       What is the error here? `timezone` should be exposed as a top-level function.

##########
File path: airflow/sensors/external_task.py
##########
@@ -213,7 +214,7 @@ def _check_for_existence(self, session) -> None:
                     )
         self._has_checked_existence = True
 
-    def get_count(self, dttm_filter, session, states) -> int:
+    def get_count(self, dttm_filter: Iterable, session: Session, states: Iterable) -> int:

Review comment:
       Iterable of what?

##########
File path: airflow/serialization/serialized_objects.py
##########
@@ -127,7 +127,10 @@ def _get_registered_timetable(importable_string: str) -> Optional[Type[Timetable
     if importable_string.startswith("airflow.timetables."):
         return import_string(importable_string)
     plugins_manager.initialize_timetables_plugins()
-    return plugins_manager.timetable_classes.get(importable_string)
+    if plugins_manager.timetable_classes:
+        return plugins_manager.timetable_classes.get(importable_string)
+    else:
+        return None

Review comment:
       We should change these global variables to remove the `Optional` case.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] khalidmammadov commented on a change in pull request #20306: Fixing MyPy issues inside airflow/serialization and airflow/sensors

Posted by GitBox <gi...@apache.org>.
khalidmammadov commented on a change in pull request #20306:
URL: https://github.com/apache/airflow/pull/20306#discussion_r769448128



##########
File path: airflow/sensors/external_task.py
##########
@@ -213,7 +214,7 @@ def _check_for_existence(self, session) -> None:
                     )
         self._has_checked_existence = True
 
-    def get_count(self, dttm_filter, session, states) -> int:
+    def get_count(self, dttm_filter: Iterable, session: Session, states: Iterable) -> int:

Review comment:
       added




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on a change in pull request #20306: Fixing MyPy issues inside airflow/serialization and airflow/sensors

Posted by GitBox <gi...@apache.org>.
uranusjr commented on a change in pull request #20306:
URL: https://github.com/apache/airflow/pull/20306#discussion_r769469899



##########
File path: airflow/serialization/serialized_objects.py
##########
@@ -118,7 +118,7 @@ def encode_timezone(var: Timezone) -> Union[str, int]:
 
 def decode_timezone(var: Union[str, int]) -> Timezone:
     """Decode a previously serialized Pendulum Timezone."""
-    return pendulum.timezone(var)
+    return pendulum.tz.timezone(var)

Review comment:
       Actually I think it’s complaining that `pendulum.timezone` is not callable (because there is actually a `pendulum.timezone` package and that got Mypy confused). Pendulum’s package structure is really weird :(




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] khalidmammadov commented on pull request #20306: Fixing MyPy issues inside airflow/serialization

Posted by GitBox <gi...@apache.org>.
khalidmammadov commented on pull request #20306:
URL: https://github.com/apache/airflow/pull/20306#issuecomment-997284799


   @potiuk Fixed


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] khalidmammadov commented on a change in pull request #20306: Fixing MyPy issues inside airflow/serialization and airflow/sensors

Posted by GitBox <gi...@apache.org>.
khalidmammadov commented on a change in pull request #20306:
URL: https://github.com/apache/airflow/pull/20306#discussion_r769481378



##########
File path: airflow/sensors/external_task.py
##########
@@ -258,16 +259,17 @@ def _handle_execution_date_fn(self, context) -> Any:
         implementation to pass all context variables as keyword arguments, to allow
         for more sophisticated returns of dates to return.
         """
-        from airflow.utils.operator_helpers import make_kwargs_callable
-
-        # Remove "execution_date" because it is already a mandatory positional argument
-        execution_date = context["execution_date"]
-        kwargs = {k: v for k, v in context.items() if k != "execution_date"}
-        # Add "context" in the kwargs for backward compatibility (because context used to be
-        # an acceptable argument of execution_date_fn)
-        kwargs["context"] = context
-        kwargs_callable = make_kwargs_callable(self.execution_date_fn)
-        return kwargs_callable(execution_date, **kwargs)
+        if self.execution_date_fn and context:
+            from airflow.utils.operator_helpers import make_kwargs_callable
+
+            # Remove "execution_date" because it is already a mandatory positional argument
+            execution_date = context["execution_date"]
+            kwargs = {k: v for k, v in context.items() if k != "execution_date"}
+            # Add "context" in the kwargs for backward compatibility (because context used to be
+            # an acceptable argument of execution_date_fn)
+            kwargs["context"] = context
+            kwargs_callable = make_kwargs_callable(self.execution_date_fn)
+            return kwargs_callable(execution_date, **kwargs)

Review comment:
       Fixed, please check if Ok?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] khalidmammadov commented on a change in pull request #20306: Fixing MyPy issues inside airflow/serialization and airflow/sensors

Posted by GitBox <gi...@apache.org>.
khalidmammadov commented on a change in pull request #20306:
URL: https://github.com/apache/airflow/pull/20306#discussion_r769564838



##########
File path: airflow/serialization/serialized_objects.py
##########
@@ -127,7 +127,10 @@ def _get_registered_timetable(importable_string: str) -> Optional[Type[Timetable
     if importable_string.startswith("airflow.timetables."):
         return import_string(importable_string)
     plugins_manager.initialize_timetables_plugins()
-    return plugins_manager.timetable_classes.get(importable_string)
+    if plugins_manager.timetable_classes:
+        return plugins_manager.timetable_classes.get(importable_string)
+    else:
+        return None

Review comment:
       Did look into this in detail, those global variables (plugins_manager.timetable_classes etc.) are defined for plugins and if they are not provided/loaded/exists then they would be None. 
   In general, these initializers (initialize_timetables_plugins(), integrate_executor_plugins() etc.) could be refactored to do get_something rather than using global variables but that's a bit larger change.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] khalidmammadov commented on a change in pull request #20306: Fixing MyPy issues inside airflow/serialization and airflow/sensors

Posted by GitBox <gi...@apache.org>.
khalidmammadov commented on a change in pull request #20306:
URL: https://github.com/apache/airflow/pull/20306#discussion_r769465486



##########
File path: airflow/serialization/serialized_objects.py
##########
@@ -118,7 +118,7 @@ def encode_timezone(var: Timezone) -> Union[str, int]:
 
 def decode_timezone(var: Union[str, int]) -> Timezone:
     """Decode a previously serialized Pendulum Timezone."""
-    return pendulum.timezone(var)
+    return pendulum.tz.timezone(var)

Review comment:
       I know it's exposed in the `__init__`.py but it is saying that pendulum is not callable for some reason. This solved that issue and I checked across airflow project it's used in a both ways.
   Actual error:
   ```
   airflow/serialization/serialized_objects.py:121: error: Module not callable
           return pendulum.timezone(var)
                  ^
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org