You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/06/24 12:36:45 UTC

[GitHub] [airflow] ecerulm commented on a change in pull request #16631: Change timezone serialization logic

ecerulm commented on a change in pull request #16631:
URL: https://github.com/apache/airflow/pull/16631#discussion_r657903018



##########
File path: tests/serialization/test_dag_serialization.py
##########
@@ -446,6 +447,26 @@ def validate_deserialized_task(
                 datetime(2019, 8, 1, tzinfo=timezone.utc),
             ),
             (pendulum.datetime(2019, 8, 1, tz='UTC'), None, pendulum.datetime(2019, 8, 1, tz='UTC')),
+            (
+                pendulum.parse("2019-08-01T00:00:00.000+00:00"),
+                None,
+                pendulum.parse("2019-08-01T00:00:00.000+00:00"),
+            ),
+            (
+                pendulum.parse("2019-08-01T00:00:00.000+01:30"),
+                None,
+                pendulum.parse("2019-08-01T00:00:00.000+01:30"),
+            ),
+            (
+                pendulum.datetime(2019, 8, 1, tz='Europe/Stockholm'),
+                None,
+                pendulum.datetime(2019, 8, 1, tz='Europe/Stockholm'),
+            ),
+            (
+                pendulum.datetime(2019, 8, 1, tz=FixedTimezone(3600)),
+                None,
+                pendulum.datetime(2019, 8, 1, tz=FixedTimezone(3600)),
+            ),
         ]

Review comment:
       @ashb , any of this added cases with offset timezones "+01:00", `FixedTimezone(3600)`, etc will raise an `InvalidTimezone` exception in the  `SerializedDAG._deserialize_timezone()` at the  `pendulum.timezone(xxx)`. 
   
   Those timezones serialize to  the timezone name which is `+01:00`,etc and `pendulum.timezone('+01:00')` raises `InvalidTimezone` because that method is meant to resolve "named" timezones only (so only works with `UTC`, `Europe/Stockholm`, `EST`, etc  )
   

##########
File path: airflow/serialization/serialized_objects.py
##########
@@ -300,7 +319,18 @@ def _deserialize(cls, encoded_var: Any) -> Any:  # pylint: disable=too-many-retu
             raise TypeError(f'Invalid type {type_!s} in deserialization.')
 
     _deserialize_datetime = pendulum.from_timestamp
-    _deserialize_timezone = pendulum.tz.timezone
+
+    @classmethod
+    def _deserialize_timezone(cls, tz_name: str) -> Timezone:
+        try:
+            return pendulum.tz.timezone(tz_name)
+        except InvalidTimezone:
+            pass
+        try:
+            return pendulum.parse("2021-01-01T12:00:00" + tz_name).tzinfo

Review comment:
       From any of the new cases provided in the tests_dag_serialization.py:test_deserialization_start_date()
   
   But as an example if the DAG start_date is `pendulum.parse("2019-08-01T00:00:00.000+01:00"` then DAG timezone will be a `pendulum.tz.timezone.FixedOffset` with name `+01:00` so that it's what it will be serialized to. 
   
   Now in the deserialization step `pendulum.timezone('+01:00')` will raise a `InvalidTimezone` because that only work for "named" timezones (and not all I must add). See https://github.com/sdispater/pendulum/issues/554 
   
   So if we want to parse `+01:00` back to a timezone object, `pendulum.timezone()` won't do it, we need to "hack it" by creating a "fake" iso 8601 string with the `+01:00` and using `pendulum.parse()` to construct a `FixedTimezone()` object for us. 
   
   I couldn't find any other method in pendulum's to convert the string `'+01:00` to a FixedTimezone/Timezone/`datetime.tzinfo`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org