Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/08/20 15:15:23 UTC

[GitHub] [airflow] uranusjr opened a new pull request #17755: Separate infer_data_interval implementations for interval timetables

uranusjr opened a new pull request #17755:
URL: https://github.com/apache/airflow/pull/17755


   CronDataIntervalTimetable and DeltaDataIntervalTimetable need different infer_data_interval implementations. Because the 'align' method aligns the time *forward*, CronDataIntervalTimetable needs to call get_prev one more time than DeltaDataIntervalTimetable to get the correct interval.
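
   A minimal sketch of the difference, using only the standard library. The two schedule classes below are hypothetical stand-ins, not Airflow's CronSchedule or its delta counterpart: the cron one models a fixed "every midnight" schedule, and the delta one assumes that 'align' leaves the time unchanged. The helper names infer_cron_interval and infer_delta_interval are also made up for illustration.

```python
from datetime import datetime, timedelta


class DailyMidnightSchedule:
    """Toy stand-in for a cron schedule that fires at every midnight."""

    def align(self, current):
        # Align *forward*: a time that is not on a boundary moves to the next midnight.
        midnight = current.replace(hour=0, minute=0, second=0, microsecond=0)
        return midnight if midnight == current else midnight + timedelta(days=1)

    def get_prev(self, current):
        # Only called on aligned times here, so stepping back one day is enough.
        return current - timedelta(days=1)


class DeltaSchedule:
    """Toy stand-in for a fixed-delta schedule (assumption: align is a no-op)."""

    def __init__(self, delta):
        self.delta = delta

    def align(self, current):
        return current

    def get_prev(self, current):
        return current - self.delta


def infer_cron_interval(schedule, run_after):
    # align() overshoots forward, so one get_prev() undoes the overshoot
    # and a second get_prev() finds the start of the interval.
    end = schedule.get_prev(schedule.align(run_after))
    return schedule.get_prev(end), end


def infer_delta_interval(schedule, run_after):
    # No overshoot to undo, so a single get_prev() is sufficient.
    end = schedule.align(run_after)
    return schedule.get_prev(end), end


run_after = datetime(2021, 8, 25, 1, 0)  # manual trigger at 1am on the 25th
cron_interval = infer_cron_interval(DailyMidnightSchedule(), run_after)
delta_interval = infer_delta_interval(DeltaSchedule(timedelta(days=1)), run_after)
print(cron_interval)
print(delta_interval)
```

   With the toy midnight schedule, the cron variant yields the interval from 0am on the 24th to 0am on the 25th, which matches the example in the code comment of this change. The delta variant ends at the trigger time itself.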
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #17755: Separate infer_data_interval implementations for interval timetables

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #17755:
URL: https://github.com/apache/airflow/pull/17755#discussion_r693064294



##########
File path: airflow/timetables/interval.py
##########
@@ -86,6 +79,13 @@ class CronDataIntervalTimetable(_DataIntervalTimetable):
     def __init__(self, cron: str, timezone: datetime.tzinfo) -> None:
         self._schedule = CronSchedule(cron, timezone)
 
+    def infer_data_interval(self, run_after: DateTime) -> DataInterval:
+        # Get the last complete period before run_after, e.g. if a DAG run is
+        # scheduled at each midnight, the data interval of a manually triggered
+        # run at 1am 25th is between 0am 24th and 0am 25th.
+        end = self._schedule.get_prev(self._schedule.align(run_after))
+        return DataInterval(start=self._schedule.get_prev(end), end=end)
+
 
 class DeltaDataIntervalTimetable(_DataIntervalTimetable):

Review comment:
       (not related to this PR)
   
   Should we change the name to `DataIntervalWithDeltaTimetable`? No strong opinion.
   
   







[GitHub] [airflow] github-actions[bot] commented on pull request #17755: Separate infer_data_interval implementations for interval timetables

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #17755:
URL: https://github.com/apache/airflow/pull/17755#issuecomment-902769337


   The PR most likely needs to run the full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly, please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.





[GitHub] [airflow] uranusjr merged pull request #17755: Separate infer_data_interval for data interval timetables

Posted by GitBox <gi...@apache.org>.
uranusjr merged pull request #17755:
URL: https://github.com/apache/airflow/pull/17755


   





[GitHub] [airflow] uranusjr commented on a change in pull request #17755: Separate infer_data_interval implementations for interval timetables

Posted by GitBox <gi...@apache.org>.
uranusjr commented on a change in pull request #17755:
URL: https://github.com/apache/airflow/pull/17755#discussion_r693075650



##########
File path: airflow/timetables/interval.py
##########
@@ -86,6 +79,13 @@ class CronDataIntervalTimetable(_DataIntervalTimetable):
     def __init__(self, cron: str, timezone: datetime.tzinfo) -> None:
         self._schedule = CronSchedule(cron, timezone)
 
+    def infer_data_interval(self, run_after: DateTime) -> DataInterval:
+        # Get the last complete period before run_after, e.g. if a DAG run is
+        # scheduled at each midnight, the data interval of a manually triggered
+        # run at 1am 25th is between 0am 24th and 0am 25th.
+        end = self._schedule.get_prev(self._schedule.align(run_after))
+        return DataInterval(start=self._schedule.get_prev(end), end=end)
+
 
 class DeltaDataIntervalTimetable(_DataIntervalTimetable):

Review comment:
       No strong opinion from me either. This name is not intended to be used directly by users anyway (they can just use `schedule_interval`). The only “complaint” (a weird term considering I named these classes) I have about this name (and `CronDataIntervalTimetable`) is that they are so terribly long.



