You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/06/20 17:49:26 UTC

[GitHub] [airflow] ecerulm opened a new issue #16551: AttributeError: 'datetime.timezone' object has no attribute 'name'

ecerulm opened a new issue #16551:
URL: https://github.com/apache/airflow/issues/16551


   <!--
   
   Welcome to Apache Airflow!  For a smooth issue process, try to answer the following questions.
   Don't worry if they're not all applicable; just try to include what you can :-)
   
   If you need to include code snippets or logs, please put them in fenced code
   blocks.  If they're super-long, please use the details tag like
   <details><summary>super-long log</summary> lots of stuff </details>
   
   Please delete these comment blocks before submitting the issue.
   
   -->
   
   <!--
   
   IMPORTANT!!!
   
   PLEASE CHECK "SIMILAR TO X EXISTING ISSUES" OPTION IF VISIBLE
   NEXT TO "SUBMIT NEW ISSUE" BUTTON!!!
   
   PLEASE CHECK IF THIS ISSUE HAS BEEN REPORTED PREVIOUSLY USING SEARCH!!!
   
   Please complete the next sections or the issue will be closed.
   These questions are the first thing we need to know to understand the context.
   
   -->
   
   **Apache Airflow version**: 2.0.2
   
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl version`):
   
   **Environment**:
   
   - **Cloud provider or hardware configuration**:
   - **OS** (e.g. from /etc/os-release):
   - **Kernel** (e.g. `uname -a`):
   - **Install tools**:
   - **Others**:
   
   **What happened**:
   
   In a DAG with `datetime(2021, 5, 31, tzinfo=timezone.utc)` it will raise an `AttributeError: 'datetime.timezone' object has no attribute 'name'` in the scheduler. 
   
   It seems that airflow relies on the tzinfo object to have a  `.name` attribute so the "canonical" `datetime.timezone.utc` does not comply with that requirement. 
   
   
   
   ```
   AttributeError: 'datetime.timezone' object has no attribute 'name'
   Process DagFileProcessor302-Process:
   Traceback (most recent call last):
     File "/usr/local/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
       self.run()
     File "/usr/local/lib/python3.8/multiprocessing/process.py", line 108, in run
       self._target(*self._args, **self._kwargs)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 184, in _run_file_processor
       result: Tuple[int, int] = dag_file_processor.process_file(
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/utils/session.py", line 70, in wrapper
       return func(*args, session=session, **kwargs)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 648, in process_file
       dagbag.sync_to_db()
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/utils/session.py", line 70, in wrapper
       return func(*args, session=session, **kwargs)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/dagbag.py", line 556, in sync_to_db
       for attempt in run_with_db_retries(logger=self.log):
     File "/home/airflow/.local/lib/python3.8/site-packages/tenacity/__init__.py", line 390, in __iter__
       do = self.iter(retry_state=retry_state)
     File "/home/airflow/.local/lib/python3.8/site-packages/tenacity/__init__.py", line 356, in iter
       return fut.result()
     File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 437, in result
       return self.__get_result()
     File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
       raise self._exception
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/dagbag.py", line 570, in sync_to_db
       DAG.bulk_write_to_db(self.dags.values(), session=session)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/utils/session.py", line 67, in wrapper
       return func(*args, **kwargs)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/dag.py", line 1892, in bulk_write_to_db
       orm_dag.calculate_dagrun_date_fields(
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/dag.py", line 2268, in calculate_dagrun_date_fields
       self.next_dagrun, self.next_dagrun_create_after = dag.next_dagrun_info(most_recent_dag_run)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/dag.py", line 536, in next_dagrun_info
       next_execution_date = self.next_dagrun_after_date(date_last_automated_dagrun)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/dag.py", line 571, in next_dagrun_after_date
       next_start = self.following_schedule(now)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/dag.py", line 485, in following_schedule
       tz = pendulum.timezone(self.timezone.name)
   AttributeError: 'datetime.timezone' object has no attribute 'name'
   ```
   **What you expected to happen**:
   
   If `start_date` or any other input parameter requires a `tzinfo` with a `name` attribute it should  check for that  in the DAG object and produce a more specific error message not `AttributeError`. 
   
   Also I guess this requirement should be explicitly mentioned in https://airflow.apache.org/docs/apache-airflow/stable/timezone.html with a comment like
   ```
     you can't use datetime.timezone.utc because it does not have a name attribute
   ```
   
   Or even better it would be not to rely on the presence of a `name` attribute in the tzinfo....
   
   
   **How to reproduce it**:
   <!---
   
   As minimally and precisely as possible. Keep in mind we do not have access to your cluster or dags.
   
   If you are using kubernetes, please attempt to recreate the issue using minikube or kind.
   
   ## Install minikube/kind
   
   - Minikube https://minikube.sigs.k8s.io/docs/start/
   - Kind https://kind.sigs.k8s.io/docs/user/quick-start/
   
   If this is a UI bug, please provide a screenshot of the bug or a link to a youtube video of the bug in action
   
   You can include images using the .md style of
   ![alt text](http://url/to/img.png)
   
   To record a screencast, mac users can use QuickTime and then create an unlisted youtube video with the resulting .mov file.
   
   --->
   
   ```
   from datetime import timedelta, datetime, timezone
   args = {
       "owner": "airflow",
       "retries": 3,
   }
   dag = DAG(
       dag_id="xxxx",
       default_args=args,
       start_date=datetime(2021, 5, 31, tzinfo=timezone.utc),
       schedule_interval="0 8 * * *",
       max_active_runs=1,
       dagrun_timeout=timedelta(minutes=60),
       catchup=False,
       description="xxxxx",
   )
   
   ```
   
   **Anything else we need to know**:
   
   
   <!--
   
   How often does this problem occur? Once? Every time etc?
   
   Any relevant logs to include? Put them here in side a detail tag:
   <details><summary>x.log</summary> lots of stuff </details>
   
   -->
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on issue #16551: AttributeError: 'datetime.timezone' object has no attribute 'name'

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #16551:
URL: https://github.com/apache/airflow/issues/16551#issuecomment-864864317


   > I mean I had (always) used `datetime(2021, 5, 31, tzinfo=timezone.utc)` before and I didn't get this.
   
   Not sure when do you mean by “before”, but the `tzinfo.name` call is added three years ago so it’s not exactly recent. But I guess that’s subjective. (FWIW the previous implementation didn’t error because it didn’t checked the timezone at all and was buggy on non-fixed timezones).
   
   > I honestly think we should completely get rid of Pendulum.
   
   That’s probably a good idea long term, but we’ll need some planning on this to re-write timezone functionalities in scheduling, which are heavily built around Pendulum right now. IMO unifying everything to Peendulum can likely help with the transition *off* Pendulum as well, if we only use the standard stuff (e.g. no `Timezone.name`, only `tzinfo.tzname()`, things like that).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ecerulm commented on issue #16551: AttributeError: 'datetime.timezone' object has no attribute 'name'

Posted by GitBox <gi...@apache.org>.
ecerulm commented on issue #16551:
URL: https://github.com/apache/airflow/issues/16551#issuecomment-866185162


   So it seems that currently the requirements for the `dag.timezone` are
   * it has to have a `name` attribute , so it must be a `pendulum.tz.timezone.*`
   * It has to have a `name`  that resolve to a "named timezone" like (`Europe/Stockholm`, `UTC`, etc)  and it can't be a FixedOffset like `+00:00`,  this show in serialization/deserialization for example. You can serialize the dag but it won't be able to deserialize because it can parse back the `+00:00` timezone. (It will eventually call `pendulum.tz.timezone('+00:00')` and that raises `InvalidTimezone`
   
   So I don't know, I can sure make it work for `tzinfo=timezone.utc` but not all `tzinfo` can be converted to a "named timezone". 
   
   What I can do it's to do a fail-fast approach in `DAG.__init__()` by doing the `self.timezone = pendulum.timezone(self.timezone.name)` there instead of waiting until `following_schedule`, `previous_schedule` or a deserialization to occur. 
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on issue #16551: AttributeError: 'datetime.timezone' object has no attribute 'name'

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #16551:
URL: https://github.com/apache/airflow/issues/16551#issuecomment-864807559






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ecerulm commented on issue #16551: AttributeError: 'datetime.timezone' object has no attribute 'name'

Posted by GitBox <gi...@apache.org>.
ecerulm commented on issue #16551:
URL: https://github.com/apache/airflow/issues/16551#issuecomment-864836448


   > I think the timezone argument was really designed to only except pendulum.tz.Timezone, not datetime.timezone (or dateutil.tz.* for that matter).
   
   I can't guarantee it but I think this is a regression, I mean I had (always) used `datetime(2021, 5, 31, tzinfo=timezone.utc)` before and I didn't get this. 
   
   But anyway I agree that you  that probably you can quickly solve this particular issue and prevent lots of confused users by `start_date = pendulum.instance(start_date)`, etc in the constructors that take datetimes.
   
   If checked and converting to pendulum datetime provides a `tzinfo` with a `name`:
   ```  
   pendulum.instance(datetime(2021, 5, 31, tzinfo=timezone.utc)).tzinfo.name # '+00:00'
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr edited a comment on issue #16551: AttributeError: 'datetime.timezone' object has no attribute 'name'

Posted by GitBox <gi...@apache.org>.
uranusjr edited a comment on issue #16551:
URL: https://github.com/apache/airflow/issues/16551#issuecomment-866411375


   If we coerce the datetime object in `__init__`, we can remove the later timezone type coercion we have right now. `pendulum.instance(start_date).tzinfo` is already a `pendulum.Timezone` and don't need to be converted again.
   
   (IOW you did the right thing removing it in the PR)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] eladkal commented on issue #16551: AttributeError: 'datetime.timezone' object has no attribute 'name'

Posted by GitBox <gi...@apache.org>.
eladkal commented on issue #16551:
URL: https://github.com/apache/airflow/issues/16551#issuecomment-864897955


   > I honestly think we should completely get rid of Pendulum.
   
   This will be kinda painful I think since some of the most important macros are Pendulum and it's very common to do manipulation over it so this may require significant code changes per dag when upgrading version


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ecerulm commented on issue #16551: AttributeError: 'datetime.timezone' object has no attribute 'name'

Posted by GitBox <gi...@apache.org>.
ecerulm commented on issue #16551:
URL: https://github.com/apache/airflow/issues/16551#issuecomment-866114299


   Before I go any further: 
   
   The current error comes from `tz = pendulum.timezone(self.timezone.name)` at [dag.py](https://github.com/apache/airflow/blob/2a59de3e558e3b60caad876dee8fa4b43a7a17cf/airflow/models/dag.py#L506-L507)
   
   If we just convert the `start_date` to pendulum with `pendulum.instance(start_date)` it fail instead with a `pendulum.tz.zoneinfo.exceptions.InvalidTimezone: Invalid timezone "+00:00"`  the timezone name will be `+00:00` and `pendulum.timezone("+00:00")` fails. This really surprised me. 
   
   
   So there are some alternatives:
   
   * Use `pendulum.instance()` , let the `self.timezone` be `pendulum.tz.timezone.FixedOffset()`/`TimeZone('+00:00')` and replace the `tz = pendulum.timezone(self.timezone.name)` with just `tz = self.timezone`
   * Just do `self.timezone = pendulum._safe_timezone(self.timezone)` that will convert `timezone.utc` into `TimeZone('UTC')`, but that is an **undocumented/private** function of pendulum.
   
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ecerulm commented on issue #16551: AttributeError: 'datetime.timezone' object has no attribute 'name'

Posted by GitBox <gi...@apache.org>.
ecerulm commented on issue #16551:
URL: https://github.com/apache/airflow/issues/16551#issuecomment-865039610


   If you assign this to me I can do a PR that converts the input  `datetime.datetime` into pendulm equivalents. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #16551: AttributeError: 'datetime.timezone' object has no attribute 'name'

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #16551:
URL: https://github.com/apache/airflow/issues/16551#issuecomment-864832581






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil closed issue #16551: AttributeError: 'datetime.timezone' object has no attribute 'name'

Posted by GitBox <gi...@apache.org>.
kaxil closed issue #16551:
URL: https://github.com/apache/airflow/issues/16551


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on issue #16551: AttributeError: 'datetime.timezone' object has no attribute 'name'

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #16551:
URL: https://github.com/apache/airflow/issues/16551#issuecomment-866411375


   If we coerce the datetime object in `__init__`, we can remove the later timezone type coercion we have right now. `pendulum.instance(start_date).tzinfo` is already a `pendulum.Timezone` and don't need to be converted again.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on issue #16551: AttributeError: 'datetime.timezone' object has no attribute 'name'

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #16551:
URL: https://github.com/apache/airflow/issues/16551#issuecomment-864807559


   I think the timezone argument was really designed to only except `pendulum.tz.Timezone`, not `datetime.timezone` (or `dateutil.tz.*` for that matter). We can probably fix this by converting `start_date`, `end_date` etc. on `DAG` and `BaseOperator` (any others? `TaskGroup`?) to `pendulum.DateTime`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ecerulm commented on issue #16551: AttributeError: 'datetime.timezone' object has no attribute 'name'

Posted by GitBox <gi...@apache.org>.
ecerulm commented on issue #16551:
URL: https://github.com/apache/airflow/issues/16551#issuecomment-866405944


   
   > Where exactly is this raised? `pendulum.instance()` seems to work with `datetime.timezone` for me:
   
   It will be raised later at `pendulum.timezone(self.time zone.name)`
   
   So the following will raise InvalidTimezone:
   ```
   pendulum.timezone(pendulum.instance(datetime(2021,6,1,tzinfo=timezone.utc)).tzinfo.name)
   ```
   
   because the pendulum's instance timezone will be Timezone("+00:00") with `.name=="+00:00"` and that's not a valid timezone name according to `pendulum.timezone()`.
   
   So in essence `pendulum.timezone("+00:00")` raises the exception. 
   
   But anyway I just removed the usage of `pendulum.timezone(self.timezone.name)` in the PR.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ecerulm commented on issue #16551: AttributeError: 'datetime.timezone' object has no attribute 'name'

Posted by GitBox <gi...@apache.org>.
ecerulm commented on issue #16551:
URL: https://github.com/apache/airflow/issues/16551#issuecomment-864836448






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #16551: AttributeError: 'datetime.timezone' object has no attribute 'name'

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #16551:
URL: https://github.com/apache/airflow/issues/16551#issuecomment-865071593


   assigned :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on issue #16551: AttributeError: 'datetime.timezone' object has no attribute 'name'

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #16551:
URL: https://github.com/apache/airflow/issues/16551#issuecomment-866364051


   > If we just convert the `start_date` to pendulum with `pendulum.instance(start_date)` it fail instead with a `pendulum.tz.zoneinfo.exceptions.InvalidTimezone: Invalid timezone "+00:00"`
   
   Where exactly is this raised? `pendulum.instance()` seems to work with `datetime.timezone` for me:
   
   ```pycon
   >>> import datetime, pendulum
   >>> pendulum.instance(datetime.datetime.now().replace(tzinfo=datetime.timezone.utc))
   DateTime(2021, 6, 23, 5, 54, 45, 240168, tzinfo=Timezone('+00:00'))
   >>> pendulum.instance(datetime.datetime.now().replace(tzinfo=datetime.timezone(datetime.timedelta(seconds=0))))
   DateTime(2021, 6, 23, 5, 55, 45, 900498, tzinfo=Timezone('+00:00'))
   ```
   
   I’m using pendulum 2.1.2 (as specified in the official constraints).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ecerulm edited a comment on issue #16551: AttributeError: 'datetime.timezone' object has no attribute 'name'

Posted by GitBox <gi...@apache.org>.
ecerulm edited a comment on issue #16551:
URL: https://github.com/apache/airflow/issues/16551#issuecomment-866466511


   > If we coerce the datetime object in `__init__`, we can remove the later timezone type coercion we have right now.
   
   Yes, and you will get rid of InvalidTimezone at `DAG.following_schedule()` and `DAG.previous_schedule()` but it will still raise InvalidTimezone on dag deserialization. So it's actually detrimental to coerce to pendulum. 
   
   Just to clarify my point, in the current (pre-PR) codebase if you provide a pendulum datetime with a non-named pendulum.timezone that will raise InvalidTimezone in `following_schedule`, `previous_schedule` and in deserialization:
   
   ```
   from datetime import datetime,timezone
   import pendulum
   from airflow.serialization import SerializedDAG
   from airflow.models import DAG
   
   start_date = datetime(2021,6,1,tzinfo=timezone.utc) 
   start_date = pendulum.instance(start_date) #  DateTime(2021, 6, 1, 0, 0, 0, tzinfo=Timezone('+00:00'))
   dag = DAG(dag_id='mydag', start_date=start_date, schedule_interval="@hourly")
   serialized_dag = SerializedDAG.to_dict(dag)
   serialized_dag['dag']['timezone'] # '+00:00'
   dag = SerializedDAG.from_dict(serialized_dag) # InvalidTimezone raised here in deserialization
   dag.following_schedule(start_date) # raises InvalidTimezone
   dag.previous_schedule(start_date) # raises InvalidTimezone
   ```
   
   The PR #16599 will get rid of the `InvalidTimezone` in the `following_schedule`, `previous_schedule` but not really help for the deserialization case. You still need to provide a serializable/deserializable datetime as start_date and there are many pendulum datetimes that are not (all the datetimes with non-named timezones like fixed offsets) 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ecerulm commented on issue #16551: AttributeError: 'datetime.timezone' object has no attribute 'name'

Posted by GitBox <gi...@apache.org>.
ecerulm commented on issue #16551:
URL: https://github.com/apache/airflow/issues/16551#issuecomment-866466511


   > If we coerce the datetime object in `__init__`, we can remove the later timezone type coercion we have right now.
   
   Yes, and you will get rid of InvalidTimezone at `DAG.following_schedule()` and `DAG.previous_schedule()` but it will still raise InvalidTimezone on dag deserialization. So it's actually detrimental to coerce to pendulum. 
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on issue #16551: AttributeError: 'datetime.timezone' object has no attribute 'name'

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on issue #16551:
URL: https://github.com/apache/airflow/issues/16551#issuecomment-864832581


   I honestly think we should completely get rid of Pendulum. I think it comes from the history where Airflow was also Python 2.7 compliant and Pendulum had many advantages over the standard datetime and friends. Right now with Python 3.6+ I see no reason why we should use Pendulum especially that it seems to be MUCH slower (like all the other non-native date parsing/handling libraries https://aboutsimon.com/blog/2016/08/04/datetime-vs-Arrow-vs-Pendulum-vs-Delorean-vs-udatetime.html). This is from 2016 so things might have changed since (and penduilum does have c extensions it seems), but  I saw so many problems coming from us using it and mixing pendulum with native datetime. 
   
   I remember a problem I spend hours looking for where different behaviour of Pendulum when passed datatime object in the presence or lack of fold parameter (one of the problems which only manifested in Python 3.5). 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ecerulm edited a comment on issue #16551: AttributeError: 'datetime.timezone' object has no attribute 'name'

Posted by GitBox <gi...@apache.org>.
ecerulm edited a comment on issue #16551:
URL: https://github.com/apache/airflow/issues/16551#issuecomment-866466511


   > If we coerce the datetime object in `__init__`, we can remove the later timezone type coercion we have right now.
   
   Yes, and you will get rid of InvalidTimezone at `DAG.following_schedule()` and `DAG.previous_schedule()` but it will still raise InvalidTimezone on dag deserialization. So it's actually detrimental to coerce to pendulum. 
   
   Just to clarify my point, in the current (pre-PR) codebase if you provide a pendulum datetime with a non-named pendulum.timezone that will raise InvalidTimezone in `following_schedule`, `previous_schedule` and in deserialization:
   
   ```
   from datetime import datetime,timezone
   import pendulum
   from airflow.serialization import SerializedDAG
   from airflow.models import DAG
   
   start_date = datetime(2021,6,1,tzinfo=timezone.utc) 
   start_date = pendulum.instance(start_date) #  DateTime(2021, 6, 1, 0, 0, 0, tzinfo=Timezone('+00:00'))
   dag = DAG(dag_id='mydag', start_date=start_date, schedule_interval="@hourly")
   serialized_dag = SerializedDAG.to_dict(dag)
   dag = SerializedDAG.from_dict(serialized_dag) # InvalidTimezone raised here in deserialization
   dag.following_schedule(start_date) # raises InvalidTimezone
   dag.previous_schedule(start_date) # raises InvalidTimezone
   ```
   
   The PR #16599 will get rid of the `InvalidTimezone` in the `following_schedule`, `previous_schedule` but not really help for the deserialization case. You still need to provide a serializable/deserializable datetime as start_date and there are many pendulum datetimes that are not (all the datetimes with non-named timezones like fixed offsets) 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] eladkal commented on issue #16551: AttributeError: 'datetime.timezone' object has no attribute 'name'

Posted by GitBox <gi...@apache.org>.
eladkal commented on issue #16551:
URL: https://github.com/apache/airflow/issues/16551#issuecomment-864897955


   > I honestly think we should completely get rid of Pendulum.
   
   This will be kinda painful I think since some of the most important macros are Pendulum and it's very common to do manipulation over it so this may require significant code changes per dag when upgrading version


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #16551: AttributeError: 'datetime.timezone' object has no attribute 'name'

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #16551:
URL: https://github.com/apache/airflow/issues/16551#issuecomment-864832581


   I honestly think we should completely get rid of Pendulum. I think it comes from the history where Airflow was also Python 2.7 compliant and Pendulum had many advantages over the standard datetime and friends. Right now with Python 3.6+ I see no reason why we should use Pendulum especially that it seems to be MUCH slower (like all the other non-native date parsing/handling libraries https://aboutsimon.com/blog/2016/08/04/datetime-vs-Arrow-vs-Pendulum-vs-Delorean-vs-udatetime.html). This is from 2016 so things might have changed since (and penduilum does have c extensions it seems), but  I saw so many problems coming from us using it and mixing pendulum with native datetime. 
   
   I remember a problem I spend hours working on where different behaviour of Pendulum when passed datatime object in the presence or lack of fold parameter (one of the problems which only manifested in Python 3.5). 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on issue #16551: AttributeError: 'datetime.timezone' object has no attribute 'name'

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on issue #16551:
URL: https://github.com/apache/airflow/issues/16551#issuecomment-864832581


   I honestly think we should completely get rid of Pendulum. I think it comes from the history where Airflow was also Python 2.7 compliant and Pendulum had many advantages over the standard datetime and friends. Right now with Python 3.6+ I see no reason why we should use Pendulum especially that it seems to be MUCH slower (like all the other non-native date parsing/handling libraries https://aboutsimon.com/blog/2016/08/04/datetime-vs-Arrow-vs-Pendulum-vs-Delorean-vs-udatetime.html). This is from 2016 so things might have changed since (and penduilum does have c extensions it seems), but  I saw so many problems coming from us using it and mixing pendulum with native datetime. 
   
   I remember a problem I spend hours looking for where different behaviour of Pendulum when passed datatime object in the presence or lack of fold parameter (one of the problems which only manifested in Python 3.5). 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org