Posted to commits@airflow.apache.org by "uranusjr (via GitHub)" <gi...@apache.org> on 2023/07/17 13:32:15 UTC

[GitHub] [airflow] uranusjr commented on issue #27399: CronTriggerTimetable lost one task occasionally

uranusjr commented on issue #27399:
URL: https://github.com/apache/airflow/issues/27399#issuecomment-1638152940

   I spent some time taking a closer look at the implementation. The problem with
   
   > I think the reason is because CronDataIntervalTimetable runs a last DAG run if it missed even when `catchup=False`
   
   is that CronDataIntervalTimetable actually does not do that! The reason it seems to be more resilient is that catchup in that timetable relies on the data interval _start_, not the trigger time. So say
   
   * You have an hourly cron schedule
   * The last run was at 3am
   * The current time is 4:05am
   
   For CronDataIntervalTimetable, since the last run covered 2–3am, the next run should cover 3–4am, which can be achieved without catchup (the run after that covers 4–5am and is not due yet). Everything is fine. For CronTriggerTimetable, however, the last run covered 3am, and the 4am run gets skipped since that’s already in the past.
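
   To make the difference concrete, here is a minimal sketch of the two `catchup=False` behaviours for an hourly schedule. This is a simplified model using only the standard library, not Airflow's actual code; the function names (`data_interval_next`, `trigger_next`) and the fixed one-hour step are assumptions made for illustration.

```python
from datetime import datetime, timedelta

HOUR = timedelta(hours=1)  # assumption: the cron fires on every full hour


def floor_hour(t: datetime) -> datetime:
    """Latest full hour at or before t."""
    return t.replace(minute=0, second=0, microsecond=0)


def ceil_hour(t: datetime) -> datetime:
    """Earliest full hour at or after t."""
    floored = floor_hour(t)
    return floored if floored == t else floored + HOUR


def data_interval_next(last_interval_end: datetime, now: datetime):
    """Data-interval style: skip ahead based on the interval *start*.

    The latest interval that has fully elapsed starts one hour before
    the most recent full hour.
    """
    latest_start = floor_hour(now) - HOUR
    start = max(last_interval_end, latest_start)
    return start, start + HOUR


def trigger_next(last_run: datetime, now: datetime) -> datetime:
    """Trigger style, as described in the comment: skip ahead to the
    first schedule that is not already in the past."""
    return max(last_run + HOUR, ceil_hour(now))


last = datetime(2023, 7, 17, 3, 0)   # last run at 3am (interval 2-3am)
now = datetime(2023, 7, 17, 4, 5)    # current time 4:05

interval = data_interval_next(last, now)
trigger = trigger_next(last, now)
print(interval)  # the 3-4am interval is still scheduled
print(trigger)   # 5am, so the 4am run is lost
```

   In this model the data-interval variant still picks up 3–4am, while the trigger variant jumps straight to 5am.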
   
   I think a reasonable fix would be to change the `catchup=False` logic to cover one schedule _before_ the current time instead, so in the above scenario, the timetable would make the next run cover 4am, and only skip the 4am run if the current time is past 5am.
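
   A rough sketch of that idea, again for an hourly schedule and using only the standard library. It is not the actual one-line change in Airflow; `proposed_trigger_next` is a hypothetical name, and aligning to the previous full hour stands in for "one schedule before the current time".

```python
from datetime import datetime, timedelta

HOUR = timedelta(hours=1)  # assumption: the cron fires on every full hour


def floor_hour(t: datetime) -> datetime:
    """Latest full hour at or before t."""
    return t.replace(minute=0, second=0, microsecond=0)


def proposed_trigger_next(last_run: datetime, now: datetime) -> datetime:
    """Hypothetical catchup=False rule: allow the most recent schedule
    before `now`, but never go back before the run after `last_run`."""
    return max(last_run + HOUR, floor_hour(now))


last_run = datetime(2023, 7, 17, 3, 0)

# At 4:05 the 4am run is still picked up.
at_0405 = proposed_trigger_next(last_run, datetime(2023, 7, 17, 4, 5))

# At 5:05 the 4am run is skipped and the 5am run is picked up instead.
at_0505 = proposed_trigger_next(last_run, datetime(2023, 7, 17, 5, 5))

print(at_0405)
print(at_0505)
```

   The `max` with `last_run + HOUR` is only there so the sketch never returns a schedule that already ran.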
   
   I can prepare a PR if that sounds reasonable (or anyone can, it’s just one line in `airflow.timetables` and fixing the corresponding test case).

