Posted to commits@airflow.apache.org by "Ash Berlin-Taylor (JIRA)" <ji...@apache.org> on 2017/11/21 13:06:01 UTC

[jira] [Created] (AIRFLOW-1837) Differing start_dates on tasks not respected by scheduler.

Ash Berlin-Taylor created AIRFLOW-1837:
------------------------------------------

             Summary: Differing start_dates on tasks not respected by scheduler.
                 Key: AIRFLOW-1837
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1837
             Project: Apache Airflow
          Issue Type: Bug
    Affects Versions: 1.9.0
            Reporter: Ash Berlin-Taylor


It is possible to specify start_date directly on tasks in a DAG, as well as on the DAG itself. This is handled correctly when creating DAG runs, but it appears to be ignored when scheduling tasks.

Given this example:

{code}
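from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator

# Default start_date, inherited by any task that does not set its own.
dag_args = {
    "start_date": datetime(2017, 9, 4),
}
dag = DAG(
    "my-dag",
    default_args=dag_args,
    schedule_interval="0 0 * * Mon",
)

# ...
with dag:
    op = PythonOperator(
        python_callable=fetcher.run,
        task_id="fetch_all_respondents",
        provide_context=True,
        # The "unfiltered" API calls are a lot quicker, so let's put them
        # ahead of any other filtered job in the queue.
        priority_weight=10,
        # Task-level override: this task should start three years earlier.
        start_date=datetime(2014, 9, 1),
    )

    # No start_date here, so this task should use the 2017-09-04 default.
    op = PythonOperator(
        python_callable=fetcher.run,
        task_id="fetch_by_demographics",
        op_kwargs={
            'demo_names': demo_names,
        },
        provide_context=True,
        priority_weight=5,
    )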

I want only the fetch_all_respondents task to run for 2014 through 2017, and from September 2017 onwards I also want the fetch_by_demographics task to run. However, right now both tasks are being scheduled from 2014-09-01.
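
To make the expected behaviour concrete, here is a minimal sketch in plain Python. It is not Airflow code and does not reflect how the scheduler is implemented; the helper name tasks_expected_for and the TASK_START_DATES mapping are hypothetical, introduced only for illustration. It assumes the rule I would expect: a task is scheduled for a given execution date only if that date is on or after the task's own effective start_date.

```python
from datetime import datetime

# Effective start dates from the example above (assumption: the first task
# uses its own start_date, the second inherits the one from default_args).
TASK_START_DATES = {
    "fetch_all_respondents": datetime(2014, 9, 1),
    "fetch_by_demographics": datetime(2017, 9, 4),
}


def tasks_expected_for(execution_date, task_start_dates=TASK_START_DATES):
    """Return the ids of tasks that should be scheduled for execution_date.

    Hypothetical helper, not part of Airflow: a task is included only if
    the execution date is on or after that task's start_date.
    """
    return sorted(
        task_id
        for task_id, start_date in task_start_dates.items()
        if execution_date >= start_date
    )


if __name__ == "__main__":
    # A run before September 2017 should contain only the first task.
    print(tasks_expected_for(datetime(2017, 8, 28)))
    # A run from 2017-09-04 onwards should contain both tasks.
    print(tasks_expected_for(datetime(2017, 9, 4)))
```

Under this rule, the weekly runs between 2014-09-01 and 2017-08-28 would contain only fetch_all_respondents, and fetch_by_demographics would first appear in the 2017-09-04 run.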


