Posted to dev@airflow.apache.org by Song Liu <so...@outlook.com> on 2018/05/11 12:26:21 UTC

Re: How to know the DAG is starting to run

Yes, I have thought of this approach, but a more elegant way would be to do it in the DAG itself. We don't want to add this "pipeline environment setup" as a separate operator; it should be handled in the DAG more gracefully.
________________________________
From: James Meickle <jm...@quantopian.com>
Sent: May 11, 2018 12:09
To: dev@airflow.incubator.apache.org
Subject: Re: How to know the DAG is starting to run

Song:

You can put an operator as the very first node in the DAG, and have
everything else in the DAG depend on it. For example, this is the approach
we use to only execute DAG tasks on stock market trading days.

-James M.

On Fri, May 11, 2018 at 3:57 AM, Song Liu <so...@outlook.com> wrote:

> Hi,
>
> I have something that I want done only once, when the DAG is constructed,
> but it seems that the DAG is instantiated every time each operator is
> run.
>
> So is there a function in the DAG that tells us it is starting to run now?
>
> Thanks,
> Song
>
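
James's suggestion above (one first node that everything else depends on) can be sketched without Airflow at all. This is only a rough illustration of the dependency shape: `run_pipeline`, the task names, and the `gate` parameter are hypothetical and are not Airflow APIs.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


# Hypothetical stand-in for illustration only; not an Airflow API.
def run_pipeline(tasks, upstream, gate):
    """Run callables in dependency order, with every task depending on `gate`."""
    # Map each task name to the set of tasks that must finish before it.
    graph = {name: set(upstream.get(name, ())) for name in tasks}
    for name in graph:
        if name != gate:
            graph[name].add(gate)  # everything else waits for the gate task
    order = []
    for name in TopologicalSorter(graph).static_order():
        tasks[name]()
        order.append(name)
    return order


log = []
tasks = {
    "setup": lambda: log.append("environment ready"),  # the first node
    "extract": lambda: log.append("extract"),
    "load": lambda: log.append("load"),
}
order = run_pipeline(tasks, upstream={"load": ["extract"]}, gate="setup")
print(order)
```

In an actual DAG file the same shape would presumably come from declaring the setup operator upstream of every other task, for instance with the operators' set_upstream / set_downstream methods.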

Re: Re: How to know the DAG is starting to run

Posted by Song Liu <so...@outlook.com>.
Yes, the event I want to know about is the creation of a DagRun.
________________________________
From: crispy16@gmail.com <cr...@gmail.com> on behalf of Chris Palmer <ch...@crpalmer.com>
Sent: May 11, 2018 15:46
To: dev@airflow.incubator.apache.org
Subject: Re: Re: How to know the DAG is starting to run

It's not even clear to me what it means for a DAG to start running. The
creation of a DagRun for a specific execution date is completely
independent of the scheduling of any TaskInstances for that DagRun. There
could be a significant delay between those two events, either deliberately
encoded into the DAG or due to resource constraints.

What event are you actually interested in knowing about? The creation of a
DagRun? The starting of any task for a DagRun? Something else?

Maybe if you provided more details on exactly what "pipeline environment
setup" you are trying to do, it would help others understand the problem
you are trying to solve.

Chris


Re: Re: How to know the DAG is starting to run

Posted by Chris Palmer <ch...@crpalmer.com>.
It's not even clear to me what it means for a DAG to start running. The
creation of a DagRun for a specific execution date is completely
independent of the scheduling of any TaskInstances for that DagRun. There
could be a significant delay between those two events, either deliberately
encoded into the DAG or due to resource constraints.

What event are you actually interested in knowing about? The creation of a
DagRun? The starting of any task for a DagRun? Something else?

Maybe if you provided more details on exactly what "pipeline environment
setup" you are trying to do, it would help others understand the problem
you are trying to solve.

Chris


Re: How to know the DAG is starting to run

Posted by Song Liu <so...@outlook.com>.
Overriding "DAG.run" sounds like a workaround: if it is running the first operation of the DAG, then do some setup, etc.

________________________________
From: Victor Noagbodji <vn...@amplify-nation.com>
Sent: May 11, 2018 12:50
To: dev@airflow.incubator.apache.org
Subject: Re: How to know the DAG is starting to run

Hey,

I don't know if Airflow has a concept of DAG-level events or callbacks. (Operators do have callbacks, though.) You might get away with subclassing the DAG class or using a class decorator.

The source suggests that ".run()" is the method you want to override. You may want to call the original "super().run()" then do what you need to do afterwards.

Let's see if that works for you.



Re: How to know the DAG is starting to run

Posted by Victor Noagbodji <vn...@amplify-nation.com>.
Hey,

I don't know if Airflow has a concept of DAG-level events or callbacks. (Operators do have callbacks, though.) You might get away with subclassing the DAG class or using a class decorator.

The source suggests that ".run()" is the method you want to override. You may want to call the original "super().run()" then do what you need to do afterwards.

Let's see if that works for you.
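
A minimal sketch of the subclassing idea above, using a stand-in base class so it runs on its own. The `DAG` class here is a placeholder and not `airflow.models.DAG`; `SetupDAG` and the `events` list are hypothetical names. Whether the real ".run()" is actually invoked at the moment you care about is not settled in this thread, so treat this as a shape to try, not a confirmed hook.

```python
class DAG:
    """Stand-in for illustration only; NOT airflow.models.DAG."""

    def __init__(self, dag_id):
        self.dag_id = dag_id
        self.events = []  # records what happened, in order

    def run(self, *args, **kwargs):
        self.events.append("run")


class SetupDAG(DAG):
    """Hypothetical subclass that wraps run() with extra steps."""

    def run(self, *args, **kwargs):
        self.events.append("setup")      # work done before the run starts
        super().run(*args, **kwargs)     # delegate to the original run()
        self.events.append("after_run")  # work done afterwards


dag = SetupDAG("example")
dag.run()
print(dag.events)
```

Setup work would go before the "super().run()" call, and anything that should happen afterwards goes after it, as suggested above.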
