Posted to dev@airflow.apache.org by Weiwei Zhang <vv...@gmail.com> on 2017/08/09 16:48:40 UTC

Airflow dependency won't change

Hi guys,

I have two tasks in a DAG, t1 and t2. The dependency used to be
t2.set_upstream(t1), and now I want to refactor the logic by setting
t1.set_upstream(t2). However, when I run this DAG, it either runs both
tasks simultaneously, or it runs t2 first and then, towards the end, also
starts t1 before t2 finishes. I am very confused by this behavior. Am I
missing something here? I am using Airflow 1.8.1.

Thanks,
-Weiwei
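
For reference, a minimal sketch of what the two declarations in the question
are expected to mean. It uses a hypothetical stand-in Task class and a
hypothetical run_order helper rather than Airflow's own operators and
scheduler, so it runs without Airflow installed; only the method name
set_upstream is taken from the thread.

```python
class Task:
    """Hypothetical stand-in for an Airflow task (not Airflow's operator class)."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.upstream = set()

    def set_upstream(self, other):
        # "self" may only start after "other" has finished.
        self.upstream.add(other)


def run_order(tasks):
    """Return task_ids in an order that respects the upstream dependencies."""
    done, order = set(), []
    pending = list(tasks)
    while pending:
        # A task is ready once every one of its upstream tasks is done.
        ready = [t for t in pending if t.upstream <= done]
        if not ready:
            raise ValueError("dependency cycle detected")
        for t in ready:
            done.add(t)
            order.append(t.task_id)
            pending.remove(t)
    return order


def old_layout():
    t1, t2 = Task("t1"), Task("t2")
    t2.set_upstream(t1)  # original DAG: t1 runs before t2
    return run_order([t1, t2])


def new_layout():
    t1, t2 = Task("t1"), Task("t2")
    t1.set_upstream(t2)  # refactored DAG: t2 runs before t1
    return run_order([t1, t2])
```

With the refactored declaration, t1 should not start until t2 has finished,
which is why the behavior described above is surprising.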

Re: Airflow dependency won't change

Posted by Weiwei Zhang <vv...@gmail.com>.
Thanks for the explanation. We do have a persistent Airflow webserver, so I
wonder if that is causing this issue. For this DAG in particular, we don't
have any scheduling set up; it is used for backfill only.

On Thu, Aug 17, 2017 at 4:00 PM, Boris Tyukin <bo...@boristyukin.com> wrote:

> Hopefully someone chimes in and explains. I just remember reading that you
> should change the DAG name if the schedule needs to be changed, which is
> why I suggested it. I've tried to swap order of tasks but I often
> add/delete tasks from DAGs as I develop, and that seems to work fine
> without renaming the DAG.
>
> Conceptually, while it is okay to generate DAGs dynamically, it is not okay
> to change a DAG from time to time - tasks should be pretty much static, as
> Max explained in another thread.

Re: Airflow dependency won't change

Posted by Boris Tyukin <bo...@boristyukin.com>.
Hopefully someone chimes in and explains. I just remember reading that you
should change the DAG name if the schedule needs to be changed, which is
why I suggested it. I've tried to swap order of tasks but I often
add/delete tasks from DAGs as I develop, and that seems to work fine
without renaming the DAG.

Conceptually, while it is okay to generate DAGs dynamically, it is not okay
to change a DAG from time to time - tasks should be pretty much static, as
Max explained in another thread.

On Thu, Aug 17, 2017 at 6:11 PM, Weiwei Zhang <vv...@gmail.com> wrote:

> Boris, you are right. After I changed the dag id to something else, the
> dependency holds. I am very curious, though, why I cannot just switch the
> order without changing the dag id. Thanks a lot!

Re: Airflow dependency won't change

Posted by Weiwei Zhang <vv...@gmail.com>.
Boris, you are right. After I changed the dag id to something else, the
dependency holds. I am very curious, though, why I cannot just switch the
order without changing the dag id. Thanks a lot!



On Wed, Aug 9, 2017 at 10:21 AM, Boris Tyukin <bo...@boristyukin.com> wrote:

> Hit the refresh button in the UI to make sure it shows the proper order
> before you run. You might also try restarting the scheduler.
>
> If that does not help, try renaming your dag_id to something like mydag_v2.

Re: Airflow dependency won't change

Posted by Boris Tyukin <bo...@boristyukin.com>.
Hit the refresh button in the UI to make sure it shows the proper order
before you run. You might also try restarting the scheduler.

If that does not help, try renaming your dag_id to something like mydag_v2.
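
One way to make that rename less ad hoc is to keep the version in a single
constant and derive the dag_id from it. This is only a sketch: the helper
versioned_dag_id and the constant names are hypothetical, and the resulting
string would still have to be passed as dag_id when the DAG is constructed.

```python
def versioned_dag_id(base_id, version):
    """Build a dag_id such as 'mydag_v2' from a base name and a version number."""
    if version < 1:
        raise ValueError("version must be >= 1")
    return "{}_v{}".format(base_id, version)


# Bump this whenever the task dependencies change in an incompatible way.
DAG_VERSION = 2
DAG_ID = versioned_dag_id("mydag", DAG_VERSION)
```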

On Wed, Aug 9, 2017 at 12:48 PM, Weiwei Zhang <vv...@gmail.com> wrote:

> Hi guys,
>
> I have two tasks in a DAG, t1 and t2. The dependency used to be
> t2.set_upstream(t1), and now I want to refactor the logic by setting
> t1.set_upstream(t2). However, when I run this DAG, it either runs both
> tasks simultaneously, or it runs t2 first and then, towards the end, also
> starts t1 before t2 finishes. I am very confused by this behavior. Am I
> missing something here? I am using Airflow 1.8.1.
>
> Thanks,
> -Weiwei
>