Posted to commits@airflow.apache.org by "Kaxil Naik (Jira)" <ji...@apache.org> on 2020/05/10 03:26:00 UTC
[jira] [Closed] (AIRFLOW-1056) Single dag run triggered when un-pausing job with catchup=False
[ https://issues.apache.org/jira/browse/AIRFLOW-1056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kaxil Naik closed AIRFLOW-1056.
-------------------------------
Resolution: Duplicate
> Single dag run triggered when un-pausing job with catchup=False
> ---------------------------------------------------------------
>
> Key: AIRFLOW-1056
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1056
> Project: Apache Airflow
> Issue Type: Bug
> Components: scheduler
> Affects Versions: 1.8.0
> Reporter: Andrew Heuermann
> Priority: Major
>
> With catchup=False, un-pausing a DAG that has missed run windows still triggers a single DAG run.
> In create_dag_run() in airflow/jobs.py, when catchup is disabled the scheduler updates dag.start_date to prevent the backfill: https://github.com/apache/incubator-airflow/blob/bb39078a35cf2bceea58d7831d7a2028c8ef849f/airflow/jobs.py#L770
> However, the function appears to schedule DAG runs based on a window, using sequential run times as the lower and upper bounds. As a result, it always schedules a single DAG run if a run was missed between the last run and the time the DAG was un-paused, even if the DAG was un-paused AFTER those missed runs.
> Some ideas for solutions:
> * Pass in the time when the scheduler last ran and use it as the lower bound of the window, though I'm not sure how easy that value is to get.
> * Update the start_date when a DAG with catchup=False is un-paused, or add a new "unpaused_date" field that serves the same purpose.
> * While the DAG is paused, have the scheduler insert a skipped job record each time the job would have run.
> There might be a simpler solution I'm missing.
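
The behaviour described above, and the "unpaused_date" idea, can be illustrated with a small standalone model. This is only a sketch under stated assumptions: it is not the code in airflow/jobs.py, it assumes a fixed-interval schedule aligned to a hypothetical anchor time, and the names next_run, latest_complete_window_start and unpaused_at are invented for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional


def latest_complete_window_start(now: datetime, interval: timedelta,
                                 anchor: datetime) -> datetime:
    """Start of the most recent schedule window that has fully elapsed by `now`."""
    elapsed_windows = (now - anchor) // interval  # timedelta // timedelta gives an int
    return anchor + (elapsed_windows - 1) * interval


def next_run(last_run: datetime, now: datetime, interval: timedelta,
             anchor: datetime,
             unpaused_at: Optional[datetime] = None) -> Optional[datetime]:
    """Simplified model of choosing the next run with catchup disabled.

    `unpaused_at` is a hypothetical field modelling the "unpaused_date"
    idea from the report; passing None models the reported behaviour.
    """
    # With catchup disabled, the lower bound jumps forward to the latest
    # complete window instead of replaying every missed window.
    lower_bound = latest_complete_window_start(now, interval, anchor)
    candidate = max(last_run + interval, lower_bound)

    # A run is only created once its window has fully elapsed.
    if candidate + interval > now:
        return None

    # Proposed fix (assumption): skip windows that closed while paused.
    if unpaused_at is not None and candidate + interval <= unpaused_at:
        return None

    return candidate


day = timedelta(days=1)
anchor = datetime(2020, 1, 1)
last_run = datetime(2020, 1, 1)        # last run before the DAG was paused
unpaused = datetime(2020, 1, 5, 12)    # DAG un-paused at midday on 5 Jan

# Reported behaviour: the window for 4 Jan closed while paused, yet it runs.
reported = next_run(last_run, unpaused, day, anchor)

# With the hypothetical unpaused_at bound, that window is skipped ...
fixed = next_run(last_run, unpaused, day, anchor, unpaused_at=unpaused)

# ... and the first run is the window that closes after the un-pause.
later = next_run(last_run, datetime(2020, 1, 6, 0, 30), day, anchor,
                 unpaused_at=unpaused)
```

In this model the missed windows for 2 and 3 January are never replayed in either case; the only difference is whether the single window that closed just before the un-pause still produces a run.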
--
This message was sent by Atlassian Jira
(v8.3.4#803005)