Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/02/15 05:37:17 UTC

[GitHub] [airflow] sushinoya opened a new issue #21570: Timetables: logical_date (deprecated execution_date) does not align with data_interval.start for manual DAG trigger

sushinoya opened a new issue #21570:
URL: https://github.com/apache/airflow/issues/21570


   ### Apache Airflow version
   
   2.2.3 (latest released)
   
   ### What happened
   
   I created a simple DAG using the `CronDataIntervalTimetable`. I used the Trigger DAG button on the UI to trigger a DAG Run. The cron expression that I was using was `5,10,15,20,25  * * * *`  and I pressed the button at `13:18`.
   
   I expected the Data Interval to be `13:10` to `13:15`, and that was represented correctly on the UI -
   
   <img width="300" alt="Screenshot 2022-02-15 at 1 21 29 PM" src="https://user-images.githubusercontent.com/23443586/153997693-f5d44c35-51ac-4124-8a96-7125f57879c1.png">
   
   I also expected the `execution_date` a.k.a. `logical_date` to be equal to `data_interval.start`, i.e. `13:10`. However, it was `13:18` instead (or `05:18` in UTC as shown in the image).
   
   <img width="600" alt="Screenshot 2022-02-15 at 1 23 58 PM" src="https://user-images.githubusercontent.com/23443586/153998009-acffcbca-127a-4f31-ab06-2e3f8688574e.png">
   
   The same can be seen from the Task Instance Details tab of the `BashOperator`.
   
   <img width="500" alt="Screenshot 2022-02-15 at 1 25 26 PM" src="https://user-images.githubusercontent.com/23443586/153998072-c6518232-109a-4d10-b873-8423a89e2b9e.png">
   
   
   
   ### What you expected to happen
   
   As explained above, I expected the `execution_date` to be equal to the `data_interval.start`. In fact, for timetables this is how it is defined - https://github.com/apache/airflow/blob/0cd3b11f3a5c406fbbd4433d8e44d326086db634/airflow/timetables/base.py#L93-L100
   
   Thus it seems rather odd for it to behave differently in this case. We do know the data interval from the Timetable's `infer_manual_data_interval` function, and it is reflected correctly on the UI too. However, the execution date is not updated accordingly.
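
To make the mismatch concrete, here is a minimal standard-library sketch of the arithmetic in this report. It is not Airflow's implementation: `prev_tick` and `manual_interval` are hypothetical helpers, and the sketch assumes the manual data interval is simply the last complete period before the trigger time.

```python
from datetime import datetime, timedelta

# Minute field of the cron expression from the report: 5,10,15,20,25 * * * *
MINUTES = (5, 10, 15, 20, 25)


def prev_tick(moment, minutes=MINUTES):
    """Return the latest schedule tick strictly before `moment` (hypothetical helper)."""
    candidate = moment.replace(second=0, microsecond=0)
    if candidate == moment:
        # `moment` is already on a whole minute, so step back to stay strictly before it.
        candidate -= timedelta(minutes=1)
    while candidate.minute not in minutes:
        candidate -= timedelta(minutes=1)
    return candidate


def manual_interval(triggered_at, minutes=MINUTES):
    """Assumed rule: the interval is the last complete period before the trigger time."""
    end = prev_tick(triggered_at, minutes)
    start = prev_tick(end, minutes)
    return start, end


triggered_at = datetime(2022, 2, 15, 13, 18)  # the button press from the report
start, end = manual_interval(triggered_at)
print("data interval:", start.time(), "to", end.time())
print("trigger time: ", triggered_at.time())
```

Under these assumptions the interval start is 13:10 while the trigger time is 13:18, which is the gap between `data_interval.start` and `execution_date` described above.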
   
   ### How to reproduce
   
   Add a file `cron_data_interval_timetable_test.py` with the following contents to the dags folder. Update the timezone to your local timezone for convenience - 
   
   ```python
   import datetime
   
   from airflow import DAG
   from airflow.operators.bash import BashOperator
   from airflow.timetables.interval import CronDataIntervalTimetable
   from pendulum.tz.timezone import Timezone
   
   with DAG(
       dag_id="cron_data_interval_timetable_test",
       start_date=datetime.datetime(2021, 1, 1),
       # Replace the timezone with your local one for convenience.
       timetable=CronDataIntervalTimetable("45,50,55  * * * *", Timezone("Asia/Singapore")),
       tags=["example", "timetable"],
       catchup=False,
   ) as dag:
       # Echo the rendered execution_date so it can be compared with the data interval.
       BashOperator(
           task_id="print_day_of_week",
           bash_command="echo Execution Date is {{ execution_date }}",
       )
   ```
   
   Enable this DAG on the UI and trigger a manual run using the play button on the top right. Then look at the Rendered Template and Task Instance Details of the DAG run's `print_day_of_week` task. Both mention the execution date.
   
   ### Operating System
   
   MacOS Big Sur (11.6.1)
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Docker-Compose
   
   ### Deployment details
   
   - `curl -LfO 'https://airflow.apache.org/docs/apache-airflow/2.2.3/docker-compose.yaml'`
   - `docker compose up`
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] boring-cyborg[bot] commented on issue #21570: Timetables: logical_date (deprecated execution_date) does not align with data_interval.start for manual DAG trigger

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #21570:
URL: https://github.com/apache/airflow/issues/21570#issuecomment-1039878764


   Thanks for opening your first issue here! Be sure to follow the issue template!
   





[GitHub] [airflow] uranusjr closed issue #21570: Timetables: logical_date (deprecated execution_date) does not align with data_interval.start for manual DAG trigger

Posted by GitBox <gi...@apache.org>.
uranusjr closed issue #21570:
URL: https://github.com/apache/airflow/issues/21570


   





[GitHub] [airflow] uranusjr commented on issue #21570: Timetables: logical_date (deprecated execution_date) does not align with data_interval.start for manual DAG trigger

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #21570:
URL: https://github.com/apache/airflow/issues/21570#issuecomment-1039999015


   This is expected. Unfortunately, execution_date pointed to the _run date_ before timetables were introduced as a concept, and we had to keep backward compatibility. The only viable alternative would be to keep execution_date pointing to the run date but have the logical date match data_interval.start, which is arguably even more confusing. Ultimately, the logical date (and execution_date) does not really mean anything semantically; it only identifies the given DAG run. It was so, is so, and will be so (until Airflow 3.0, when we can break compatibility, but at that point I'd hope the logical date ceases to exist entirely). I am sorry, but it is what it is.

