Posted to dev@airflow.apache.org by Lance Norskog <la...@gmail.com> on 2016/05/28 01:22:35 UTC

"intra-day" backfill mistake

I just made the same mistake, twice.

I have a DAG with this schedule:

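from datetime import datetime
from airflow import DAG

# args: the default_args dict, defined elsewhere in the DAG file
dag = DAG(
    dag_id='xyz', default_args=args,
    schedule_interval='0 2 * * *',  # cron: every day at 02:00
    start_date=datetime(2016, 5, 16, 2),
    max_active_runs=1,
)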


It runs daily, at 2AM UTC. I ran this command:
airflow backfill -s 2016-05-24 xyz

So, of course, it created a new run at midnight UTC instead of running the
DAG at 2am.

Should 'airflow backfill' and similar respect the periodicity of the DAG?
Could these commands give an error and require a flag to force running
outside the DAG's periodicity?
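
For illustration, the mismatch can be reproduced with the standard library
alone. This is a minimal sketch, not Airflow's actual parsing code:
parse_cli_date and matches_daily_schedule are hypothetical helpers, and the
hour and minute constants simply mirror the '0 2 * * *' schedule above. It
assumes a date-only argument is read with the time defaulting to 00:00, which
is consistent with the midnight run described.

```python
from datetime import datetime

# Hypothetical stand-ins for the '0 2 * * *' schedule: minute 0, hour 2.
SCHEDULE_HOUR = 2
SCHEDULE_MINUTE = 0


def parse_cli_date(text):
    """Parse a date-only string; the time of day defaults to 00:00:00."""
    return datetime.strptime(text, "%Y-%m-%d")


def matches_daily_schedule(moment, hour=SCHEDULE_HOUR, minute=SCHEDULE_MINUTE):
    """Return True only if `moment` falls exactly on the daily run time."""
    actual = (moment.hour, moment.minute, moment.second, moment.microsecond)
    return actual == (hour, minute, 0, 0)


start = parse_cli_date("2016-05-24")
print(start)                          # 2016-05-24 00:00:00
print(matches_daily_schedule(start))  # False
```

A check like this is roughly what the CLI would need before it could warn
about, or refuse, a start date that is off the DAG's schedule.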

-- 
Lance Norskog
lance.norskog@gmail.com
Redwood City, CA

Re: "intra-day" backfill mistake

Posted by Lance Norskog <la...@gmail.com>.
In general, since a DAG declares its own periodicity, why should any
reference to the DAG need to also be precise?
Shouldn't an incomplete timestamp refer to "the next time the DAG runs"?
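
As a rough sketch of that idea, assuming a simple daily schedule and using
only the standard library: next_daily_run and resolve_start are hypothetical
names, not Airflow functions, and the default hour and minute are taken from
the '0 2 * * *' schedule in the original message.

```python
from datetime import datetime, timedelta


def next_daily_run(moment, hour, minute):
    """Return the first daily run time at or after `moment`."""
    candidate = moment.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate < moment:
        # Today's run time has already passed, so use tomorrow's.
        candidate += timedelta(days=1)
    return candidate


def resolve_start(text, hour=2, minute=0):
    """Interpret a date-only string as the next run of a daily schedule."""
    return next_daily_run(datetime.strptime(text, "%Y-%m-%d"), hour, minute)


print(resolve_start("2016-05-24"))  # 2016-05-24 02:00:00
```

A general version would have to handle arbitrary cron expressions rather than
a fixed hour and minute; this sketch covers only the daily case.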


On Tue, May 31, 2016 at 9:05 AM, Chris Riccomini <cr...@apache.org>
wrote:

> I agree. The date format handling in the CLI is not intuitive. IMO, it
> should force full-resolution YYYYMMDDHHMMSS.micro.

-- 
Lance Norskog
lance.norskog@gmail.com
Redwood City, CA

Re: "intra-day" backfill mistake

Posted by Chris Riccomini <cr...@apache.org>.
I agree. The date format handling in the CLI is not intuitive. IMO, it
should force full-resolution YYYYMMDDHHMMSS.micro.
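
A strict parser along those lines might look like the sketch below. It is an
assumption-laden illustration, not the CLI's real behaviour:
parse_full_resolution and FULL_FORMAT are hypothetical names, and the format
string is one reading of "YYYYMMDDHHMMSS.micro".

```python
from datetime import datetime

# One possible reading of YYYYMMDDHHMMSS.micro as a strptime format.
FULL_FORMAT = "%Y%m%d%H%M%S.%f"


def parse_full_resolution(text):
    """Accept only a complete timestamp; raise ValueError for anything else."""
    return datetime.strptime(text, FULL_FORMAT)


print(parse_full_resolution("20160524020000.000000"))  # 2016-05-24 02:00:00

try:
    parse_full_resolution("2016-05-24")
except ValueError as error:
    print("rejected:", error)
```

Note that %f accepts one to six digits, so the fractional part is required
but need not be written out in full.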

On Fri, May 27, 2016 at 7:28 PM, Bence Nagy <be...@underyx.me> wrote:

> I can think of weird legit reasons why one would force running the dag on
> an intra-day start date, but a warning message requiring confirmation
> (perhaps also offering to autocorrect the date to the nearest schedule
> matching one) would be awesome to have.

Re: "intra-day" backfill mistake

Posted by Bence Nagy <be...@underyx.me>.
I can think of weird legit reasons why one would force running the dag on
an intra-day start date, but a warning message requiring confirmation
(perhaps also offering to autocorrect the date to the nearest schedule
matching one) would be awesome to have.
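
To make the autocorrect idea concrete, here is a minimal sketch for a daily
schedule, standard library only. nearest_daily_run and check_start_date are
hypothetical helpers, not part of Airflow, and the default hour and minute
come from the '0 2 * * *' schedule quoted below. The confirmation prompt
itself is left out; the function only produces the warning text.

```python
from datetime import datetime, timedelta


def nearest_daily_run(moment, hour, minute):
    """Return the daily run time closest to `moment` (ties go to the earlier run)."""
    same_day = moment.replace(hour=hour, minute=minute, second=0, microsecond=0)
    candidates = [same_day - timedelta(days=1), same_day, same_day + timedelta(days=1)]
    return min(candidates, key=lambda run: abs(run - moment))


def check_start_date(moment, hour=2, minute=0):
    """Return None if `moment` is on schedule, else a warning suggesting a fix."""
    suggestion = nearest_daily_run(moment, hour, minute)
    if suggestion == moment:
        return None
    return "%s is off-schedule; nearest scheduled run is %s" % (moment, suggestion)


print(check_start_date(datetime(2016, 5, 24)))
# 2016-05-24 00:00:00 is off-schedule; nearest scheduled run is 2016-05-24 02:00:00
```

Returning the warning instead of raising leaves room for the "weird legit
reasons" case: the caller can still proceed after confirmation.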

On Sat, May 28, 2016, 3:22 AM Lance Norskog <la...@gmail.com> wrote:

> I just made the same mistake, twice.
>
> I have a Dag with this schedule:
>
> dag = DAG(
>     dag_id='xyz', default_args=args,
>     schedule_interval='0 2 * * *',
>     start_date=datetime(2016, 5, 16,2),
>     max_active_runs=1
>     )
>
>
> It runs daily, at 2AM UTC. I ran this command:
> airflow backfill -s 2016-05-24 xyz
>
> So, of course, it created a new run at midnight UTC instead of running the
> DAG at 2am.
>
> Should 'airflow backfill' and similar respect the periodicity of the DAG?
> Could these commands give an error and require a flag to force running
> outside the DAG's periodicity?
>
> --
> Lance Norskog
> lance.norskog@gmail.com
> Redwood City, CA
>