Posted to dev@airflow.apache.org by Sumit Maheshwari <su...@gmail.com> on 2017/01/14 11:43:01 UTC

Cron expression with depends_on_past causing no TIs to be generated

Found that on our current prod version (1.7.0), and on master as well, the
scheduler doesn't fire TIs if *schedule_interval* is a cron expression and
*depends_on_past* is True.

The very simple DAG I used for testing can be found here:
http://pastebin.com/wPnFjRwD

However it creates new DAG runs as per the scheduler, see this:
https://drive.google.com/file/d/0B9EjxDCEDhERLWNRT196OWgtcnM/view?usp=sharing

Scheduler logs for the DAG can be seen here: http://pastebin.com/z5aRxh76

Please let me know if this is a known issue, or if there is any workaround.
Our current workaround is to set *depends_on_past* to *False* at first
and then set it to *True* later.

Thanks,
Sumit

Re: Cron expression with depends_on_past causing no TIs to be generated

Posted by Bolke de Bruin <bd...@gmail.com>.
Np. Can you please try the patch and +1 it? I would like to get it into 1.8.

Thx
Bolke.

Re: Cron expression with depends_on_past causing no TIs to be generated

Posted by Sumit Maheshwari <su...@gmail.com>.
Thanks a lot, Bolke, for looking into this issue and raising a PR in such a
short time.




Re: Cron expression with depends_on_past causing no TIs to be generated

Posted by Bolke de Bruin <bd...@gmail.com>.
PR is out. Should fix your issue and fix a bug that was introduced with the catchup feature.

https://github.com/apache/incubator-airflow/pull/1994

Bolke



Re: Cron expression with depends_on_past causing no TIs to be generated

Posted by Bolke de Bruin <bd...@gmail.com>.
You will need to align your start_date with the cron interval. In this case your start_date should include the 30 minutes. This is expected behaviour in 1.7.0 and 1.7.1. In master we auto-align the start_date with the interval in the scheduler. However, the dependency checker doesn’t do this, so there it is an issue. I have created AIRFLOW-759 to track this issue.

Bolke.
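
[Illustration] The alignment advice above can be checked with a small standard-library sketch. It is not Airflow code, and it does not reproduce the DAG from the pastebin. It assumes an hourly cron expression such as "30 * * * *", which the "30 minutes" remark suggests; the helper names next_aligned and is_aligned are hypothetical and exist only for this example.

```python
from datetime import datetime, timedelta


def next_aligned(start, minute):
    """Return the first datetime >= start that falls on the given minute.

    Models the ticks of an hourly cron such as "30 * * * *" (an assumed
    schedule, not taken from the original DAG).
    """
    candidate = start.replace(minute=minute, second=0, microsecond=0)
    if candidate < start:
        # This hour's tick has already passed, so move to the next hour.
        candidate += timedelta(hours=1)
    return candidate


def is_aligned(start, minute):
    """True if start itself is a tick of the assumed hourly schedule."""
    return next_aligned(start, minute) == start


# A start_date at midnight does not sit on a "30 * * * *" tick ...
misaligned = datetime(2017, 1, 1, 0, 0)
# ... while one that includes the 30 minutes does.
aligned = datetime(2017, 1, 1, 0, 30)

print(is_aligned(misaligned, 30))    # False
print(is_aligned(aligned, 30))       # True
print(next_aligned(misaligned, 30))  # 2017-01-01 00:30:00
```

If is_aligned returns False for a DAG's start_date, moving the start_date to the value returned by next_aligned is one way to follow the advice for 1.7.0 and 1.7.1.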

