Posted to commits@airflow.apache.org by "Pawel Bartoszek (JIRA)" <ji...@apache.org> on 2019/04/24 15:38:00 UTC

[jira] [Created] (AIRFLOW-4404) Improve support of cron-style jobs

Pawel Bartoszek created AIRFLOW-4404:
----------------------------------------

             Summary: Improve support of cron-style jobs
                 Key: AIRFLOW-4404
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-4404
             Project: Apache Airflow
          Issue Type: Improvement
          Components: scheduler
    Affects Versions: 1.10.3
            Reporter: Pawel Bartoszek


Airflow supports cron-like jobs with one downside: on the very first deployment of a job (a completely new DAG), an extra DAG run is created for the most recent past period.
When the DAG is redeployed (the DAG name stays the same), the DB already contains the latest run, and the scheduler behaves like a genuine cron scheduler.
 
To better describe what I mean I prepared an example:
 
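{code:python}
from datetime import datetime

from airflow import DAG

with DAG(
    dag_id="dag",
    start_date=datetime(2019, 4, 1),
    schedule_interval="0 2 * * *",  # every day at 02:00
    default_view="graph",
    orientation="TB",
    concurrency=1,
    max_active_runs=1,
    catchup=False,  # do not backfill periods before the latest one
) as dag:
    pass  # tasks omitted in the original report
{code}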
 
I deploy 'dag' for the first time when the system time is *2019-04-03 3 PM*.
Airflow creates a DAG run with an execution date of 2019-04-02 2 AM immediately after the deployment. However, when a new version of 'dag' is redeployed, the next run is triggered according to the cron expression, i.e. with the redeployment done at 2019-04-03 6 PM, the next DAG run will be at 2019-04-04 2 AM.
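
The arithmetic behind this scenario can be sketched in plain Python. This is an illustration only, not Airflow's scheduler code: the helper names are hypothetical, and the daily 02:00 schedule is hard-coded as an assumption rather than parsed from the cron expression. It relies on the Airflow convention that a run's execution date marks the start of the period it covers, so a run becomes eligible once that period has ended.

```python
from datetime import datetime, timedelta

PERIOD = timedelta(days=1)  # "0 2 * * *" fires once a day (assumed, not parsed)


def last_fire_at_or_before(now, hour=2):
    """Most recent daily fire time (hour:00) that is <= now. Hypothetical helper."""
    fire = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    return fire if fire <= now else fire - PERIOD


def latest_completed_execution_date(now, hour=2):
    """Start of the latest period that has fully ended by `now`."""
    return last_fire_at_or_before(now, hour) - PERIOD


first_deploy = datetime(2019, 4, 3, 15, 0)  # 2019-04-03 3 PM
print(latest_completed_execution_date(first_deploy))  # 2019-04-02 02:00:00
```

The period starting at 2019-04-02 2 AM ended at 2019-04-03 2 AM, which is already in the past at deployment time, so the run for it is created right away.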
 
*The requested change*
For cron-style jobs, Airflow should behave like a Unix cron scheduler, i.e. it should start the very first DAG run only once the system time has passed the next date matching the cron expression, so that no extra run is created.
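
A minimal sketch of the requested rule, under the same assumptions (hypothetical helper name, daily 02:00 schedule hard-coded instead of parsed from the cron expression): the first run is triggered at the first cron time strictly after the deployment.

```python
from datetime import datetime, timedelta


def next_fire_after(now, hour=2):
    """First daily fire time (hour:00) strictly after `now`. Hypothetical helper."""
    fire = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    return fire if fire > now else fire + timedelta(days=1)


# First deployment at 3 PM and redeployment at 6 PM on the same day
print(next_fire_after(datetime(2019, 4, 3, 15, 0)))  # 2019-04-04 02:00:00
print(next_fire_after(datetime(2019, 4, 3, 18, 0)))  # 2019-04-04 02:00:00
```

Under this rule a brand-new DAG and a redeployed DAG get the same next trigger time, which is how a Unix cron daemon would treat them.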
 


