Posted to dev@airflow.apache.org by Vishesh Jain <vi...@twitter.com.INVALID> on 2019/12/06 09:37:40 UTC

[DISCUSS] Approach for JIRA AIRFLOW-5523

Hi Team

I am working on JIRA AIRFLOW-5523
<https://issues.apache.org/jira/browse/AIRFLOW-5523>.

As per the JIRA, the requirement is to automate the deletion of dags
whose dag file is no longer present.

In the current state, when a user tries to delete a dag from the UI while
its file is still present, it fails with the following message:

[image: Screenshot 2019-12-05 at 2.49.03 PM.png]

Once this dag file is removed, it becomes a "zombie" dag, as shown below.
[image: image.png]

In this state, the dag can be deleted from the UI, which asks for
confirmation with the following warning:
[image: Screenshot 2019-12-05 at 2.55.25 PM.png]

One way to remove these zombie dags is to filter such entries, i.e. entries
for which the dag file is not present, out of the list shown in the UI.
After this change, the cross button ([image: Screenshot
2019-12-05 at 2.58.30 PM.png]) would no longer make sense either, so it can
also be removed.

Now if the user deletes the dag file, they won't see those dag entries
anymore in the UI.
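
A minimal sketch of the filtering idea above, assuming each dag entry
records the path of the file it was parsed from. `DagRecord`, `fileloc`
and `visible_dags` are hypothetical names used only for illustration; they
are not Airflow's actual models or API.

```python
import os
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class DagRecord:
    """Hypothetical stand-in for one row of the dag table."""
    dag_id: str
    fileloc: str  # assumed: path of the file the dag was parsed from


def visible_dags(records: Iterable[DagRecord]) -> List[DagRecord]:
    """Return only the records whose source file still exists on disk."""
    return [record for record in records if os.path.isfile(record.fileloc)]
```

With a filter like this in front of the UI listing, an entry disappears as
soon as its file is gone, without touching the stored history.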

Kindly provide your feedback or concerns.

Thanks & Regards
Vishesh Jain

Re: [DISCUSS] Approach for JIRA AIRFLOW-5523

Posted by Vishesh Jain <vi...@twitter.com.INVALID>.
I tried restarting the webserver as well, but I am still getting the same
issue.

I also checked the code view for case 1b; it shows the updated code with the
new dagId.

I found one more issue:
When I update a file so that it becomes unparseable (a broken dag) and then
restart the scheduler and webserver, I can see the Broken Dag entry along
with the dag table entry.
Here is the image <https://pasteboard.co/IKzdu1I.png>
When I click on this dag entry, it also shows the updated code. And if I
try to trigger an instance, it breaks with the following error:

  File "/Users/visheshj/workspace/temp/airflow/airflow/models/dag.py", line 1641, in create_dagrun
    return self.get_dag().create_dagrun(run_id=run_id,
AttributeError: 'NoneType' object has no attribute 'create_dagrun'
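
The traceback suggests that the dag lookup returned None and the result was
used without a check. A minimal sketch of a guard, assuming a lookup
callable that returns None for a dag that cannot be loaded;
`safe_create_dagrun`, `get_dag` and `DagNotLoadedError` are hypothetical
names, not part of Airflow:

```python
from typing import Any, Callable, Optional


class DagNotLoadedError(RuntimeError):
    """Hypothetical error for a dag entry with no parsed dag behind it."""


def safe_create_dagrun(
    dag_id: str,
    get_dag: Callable[[str], Optional[Any]],
    run_id: str,
) -> Any:
    """Create a run only if the dag could actually be loaded."""
    dag = get_dag(dag_id)
    if dag is None:
        # Fail with a readable message instead of an AttributeError on None.
        raise DagNotLoadedError(
            f"DAG {dag_id!r} could not be loaded; "
            "its file may be missing or broken"
        )
    return dag.create_dagrun(run_id=run_id)
```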


Thanks
Vishesh Jain


On Tue, Dec 10, 2019 at 7:34 AM Kaxil Naik <ka...@gmail.com> wrote:

> You will need to restart the webserver as well. It is possible that the DAG
> file is already in the DagBag used by the Airflow Webserver.
>
> Regarding your 3rd question, that is the expected behavior: since your DAG
> file still exists, it is re-parsed and shown in the UI, but the history is
> deleted.

Re: [DISCUSS] Approach for JIRA AIRFLOW-5523

Posted by Kaxil Naik <ka...@gmail.com>.
You will need to restart the webserver as well. It is possible that the DAG
file is already in the DagBag used by the Airflow Webserver.

Regarding your 3rd question, that is the expected behavior: since your DAG
file still exists, it is re-parsed and shown in the UI, but the history is
deleted.
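
A toy model of the behavior described above, assuming that parsing the DAG
files re-registers every DAG it finds, while deleting from the UI drops
both the entry and its run history. `MetadataStore` and its methods are
hypothetical and only mimic the described effect; they are not Airflow
code.

```python
from typing import Dict, Iterable, List


class MetadataStore:
    """Hypothetical in-memory stand-in for the metadata database."""

    def __init__(self) -> None:
        # Maps dag_id to the list of recorded run ids (the "history").
        self.runs: Dict[str, List[str]] = {}

    def sync_from_files(self, parsed_dag_ids: Iterable[str]) -> None:
        """Register every parsed dag; existing history is left untouched."""
        for dag_id in parsed_dag_ids:
            self.runs.setdefault(dag_id, [])

    def record_run(self, dag_id: str, run_id: str) -> None:
        self.runs[dag_id].append(run_id)

    def delete_dag(self, dag_id: str) -> None:
        """Delete from the UI: the entry and its history are removed."""
        self.runs.pop(dag_id, None)
```

If the file is still on disk, the next `sync_from_files` call brings the
entry back with an empty history, which matches what was observed.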

On Mon, Dec 9, 2019 at 12:38 PM Vishesh Jain <vi...@twitter.com.invalid>
wrote:

> Hi Team
>
> I tried reproducing this issue on current Airflow Master. Looks like zombie
> dags issue is not there. But I could identify a few other issues in it.
>
> List of issues in Master branch:
>
>    1. Dag Id change and restart scheduler
>    In this, I tried renaming the dagId, and restarted the scheduler. It
>    gave the following result <https://pasteboard.co/IKrQ0vR.png>
>    On top of this, when I try to trigger the old dag, it broke and threw
>    the error <https://pasteboard.co/IKrQP9C.png>
>
>    It again has 2 subcases:
>    a. When dag doesn't have history, and you click on the zombie dag link.
>    It shows the following message <https://pasteboard.co/IKrRoJy.png>
>    b. When dag has a history or is in running state, it allows you to click
>    the link and check views, code, and other options.
>
>    2. Dag file deletion and restart scheduler
>    Ideally, this case should have behaved the same as Case-1, but it didn't.
>    Here, I was still able to see the active entry of the dag as above. On
>    top of this, I was able to trigger the instance of the deleted file after
>    the scheduler restart.
>
>    3. Delete from UI
>
>    a. When dag file exists:
>    In this case, even after deleting the dag from the UI, it cleans up the
>    history but not the dag.
>
>    b. When dag file doesn't exist:
>    In this case, dag is deleted forever along with history.
>
>
>
> Thanks
> Vishesh Jain
>

Re: [DISCUSS] Approach for JIRA AIRFLOW-5523

Posted by Vishesh Jain <vi...@twitter.com.INVALID>.
Hi Team

I tried reproducing this issue on current Airflow Master. It looks like the
zombie dags issue is not there, but I could identify a few other issues.

List of issues in Master branch:

   1. Dag Id change and restart scheduler
   Here, I renamed the dagId and restarted the scheduler. It
   gave the following result <https://pasteboard.co/IKrQ0vR.png>
   On top of this, when I tried to trigger the old dag, it broke and threw
   the error <https://pasteboard.co/IKrQP9C.png>

   This again has 2 subcases:
   a. When the dag doesn't have history and you click on the zombie dag
   link, it shows the following message <https://pasteboard.co/IKrRoJy.png>
   b. When the dag has history or is in running state, it allows you to
   click the link and check views, code, and other options.

   2. Dag file deletion and restart scheduler
   Ideally, this case should have behaved the same as Case-1, but it didn't.
   Here, I was still able to see the active entry of the dag as above. On
   top of this, after the scheduler restart I was still able to trigger an
   instance of the dag whose file had been deleted.

   3. Delete from UI

   a. When dag file exists:
   In this case, deleting the dag from the UI cleans up the history but
   does not remove the dag itself.

   b. When dag file doesn't exist:
   In this case, the dag is deleted forever along with its history.
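
Cases 1 and 2 above both leave an entry whose dag id is no longer produced
by any file. A minimal sketch of how such entries could be detected,
assuming the set of dag ids stored in the database and the set produced by
the latest parse are both available; `find_stale_dag_ids` is a hypothetical
helper, not an Airflow function.

```python
from typing import Iterable, List


def find_stale_dag_ids(
    known_dag_ids: Iterable[str],
    parsed_dag_ids: Iterable[str],
) -> List[str]:
    """Return ids that are stored but were not produced by the latest parse."""
    return sorted(set(known_dag_ids) - set(parsed_dag_ids))
```

A renamed dag shows up here under its old id, and a dag whose file was
deleted shows up as well, so both cases could be handled the same way.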



Thanks
Vishesh Jain


On Fri, Dec 6, 2019 at 3:15 PM Kaxil Naik <ka...@gmail.com> wrote:

> Did you reproduce this issue with the current Airflow Master? I think the
> issue was solved in master.

Re: [DISCUSS] Approach for JIRA AIRFLOW-5523

Posted by Kaxil Naik <ka...@gmail.com>.
Did you reproduce this issue with the current Airflow Master? I think the
issue was solved in master.
