Posted to dev@airflow.apache.org by Jarek Potiuk <Ja...@polidea.com> on 2019/12/29 12:26:59 UTC

[PROPOSAL] [FUTURE] Semi-automated tool for migration to 2.0.0

I thought (and discussed with users at various conferences) that we
should make it super-easy to migrate to Airflow 2.0 when we release it.
There are a number of incompatibilities that we mention in UPDATING.md, so we
have quite a good 'base' for the list of incompatibilities. But people
sometimes have 100s or 1000s of DAGs, so I think we should do better than
that and provide a semi-automated migration tool for them.

I'd love to hear what you think about it.

I created a JIRA issue for it:
https://issues.apache.org/jira/browse/AIRFLOW-6390

Here is what it says:

Before releasing 2.0.0, we should create a DAG migration tool for migrating
DAGs to Apache Airflow 2.0, based on UPDATING.md and the incompatibilities
introduced in 2.0.0.

It should mostly automate migrating users' DAGs, correcting them where
needed, and print warnings for all cases that need some manual review
and corrections (with appropriate instructions).

All the changes performed automatically should be explained, described
and logged so that you can refer to the changes made to your DAG.

The instructions for cases where manual corrections are needed should be more
detailed than just describing the changes in UPDATING.md. They should mention
the consequences of such changes and the reasoning behind them, and explain
what the user might expect after migration.
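
To make the description above a bit more concrete, here is a minimal sketch
of how such a tool could be structured. It is not an existing tool: the names
migrate_source and MigrationReport are hypothetical, and the rename table and
the provide_context warning are only illustrative entries that would have to
be checked against UPDATING.md before use.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Illustrative rename table (an assumption for this sketch). A real tool
# would build this list from the entries in UPDATING.md.
IMPORT_RENAMES: Dict[str, str] = {
    "airflow.operators.bash_operator": "airflow.operators.bash",
    "airflow.operators.python_operator": "airflow.operators.python",
}

# Patterns the sketch does not try to fix; they only produce a warning
# with instructions for manual review. The advice text is a placeholder.
MANUAL_REVIEW: Dict[str, str] = {
    r"\bprovide_context\s*=": (
        "the provide_context argument needs manual review; "
        "see the matching note in UPDATING.md"
    ),
}


@dataclass
class MigrationReport:
    """Log of automatic changes and of cases left for manual review."""
    changes: List[str] = field(default_factory=list)
    warnings: List[str] = field(default_factory=list)


def migrate_source(source: str) -> Tuple[str, MigrationReport]:
    """Rewrite known module paths and collect warnings, line by line."""
    report = MigrationReport()
    migrated_lines = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for old, new in IMPORT_RENAMES.items():
            # Match the dotted module path only as a whole name.
            pattern = r"(?<![\w.])" + re.escape(old) + r"(?!\w)"
            line, count = re.subn(pattern, new, line)
            if count:
                report.changes.append(
                    f"line {lineno}: replaced {old} with {new}"
                )
        for pattern, advice in MANUAL_REVIEW.items():
            if re.search(pattern, line):
                report.warnings.append(f"line {lineno}: {advice}")
        migrated_lines.append(line)
    migrated = "\n".join(migrated_lines)
    if source.endswith("\n"):
        migrated += "\n"  # keep the trailing newline of the original file
    return migrated, report
```

Calling migrate_source() on the text of a DAG file returns the rewritten
source together with a report. Every automatic change and every manual-review
warning carries a line number, so both lists can be written to a log.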

J.

-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
[image: Polidea] <https://www.polidea.com/>

Re: [PROPOSAL] [FUTURE] Semi-automated tool for migration to 2.0.0

Posted by Jarek Potiuk <Ja...@polidea.com>.
Correct link: https://github.com/apache/airflow/pull/6955



Re: [PROPOSAL] [FUTURE] Semi-automated tool for migration to 2.0.0

Posted by Jarek Potiuk <Ja...@polidea.com>.
+1 Anton.

We've included it in the new simplified GitHub PR template, which has a
chance of actually being treated as a checklist :)
See the description here: https://github.com/apache/airflow/pull/955 .

After some thought, I am also OK with Kamil adding the comment in UPDATING.md
that tells people to provide more information (see below). I think it's good
if we can add more information, and we should also update the existing
descriptions there.

<!--

I'm glad you want to write a new note. Remember that this note is intended for users.
Make sure it contains the following information:

- [ ] Previous behaviors
- [ ] New behaviors
- [ ] If possible, a simple example of how to migrate. This may include a simple code example.
- [ ] If possible, the benefit for the user after migration, e.g. "we want to make these changes to unify class names."
- [ ] If possible, the reason for the change, which adds more context for those interested, e.g. a reference to an Airflow Improvement Proposal.

More tips can be found in the guide: https://developers.google.com/style/inclusive-documentation

-->


J.

Re: [PROPOSAL] [FUTURE] Semi-automated tool for migration to 2.0.0

Posted by Anton Zayniev <an...@gmail.com>.
I think we can add something like "Airflow 2.0 migration description for
breaking changes" to the PR checklist. Contributors would have to add a Jira
issue describing the migration changes related to their PR in order to get the
PR accepted. Then somebody can work on that issue and close it. I think that
could be a great set of tasks for new contributors.
Anton


Re: [PROPOSAL] [FUTURE] Semi-automated tool for migration to 2.0.0

Posted by Jarek Potiuk <Ja...@polidea.com>.
I agree we need more detailed instructions when we release 2.0.

However, I think we should hold off on describing all the details and
instructions until we are closer to the 2.0 release and have closed the list
of incompatibilities. For now, just describing what changed should be enough.

We might yet want to revert some of the changes after we test them, or there
might be some sequence of operations that has to be executed in the right
order. Or we might combine several notes in UPDATING.md into one. So I think
it's best to write this kind of detailed description when we are preparing
for the actual release.

J.


Re: [PROPOSAL] [FUTURE] Semi-automated tool for migration to 2.0.0

Posted by Kamil Breguła <ka...@polidea.com>.
Hello,

I think that before building automatic tools, we should try to improve the
manual process. Some notes in the UPDATING.md file are laconic and enigmatic,
and do not allow you to migrate easily.
I have created a PR that contains some tips:
https://github.com/apache/airflow/pull/6960/files
If we develop a precise manual migration process, we can automate it. I
think it will be difficult for us to develop an automatic tool if the
manual process is complex, problematic and hard for the user to understand.

Best regards,
Kamil



Re: [PROPOSAL] [FUTURE] Semi-automated tool for migration to 2.0.0

Posted by Kaxil Naik <ka...@gmail.com>.
Yes, definitely. I had thought of something like Python's 2to3 script. We
might want to create something similar.
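
As a rough illustration of the 2to3-style approach (working on the parsed
syntax tree rather than on raw text), the sketch below uses Python's standard
ast module to report imports that would need to change. It only detects and
does not rewrite. The function name find_deprecated_imports and the module
table are assumptions made for this example; a real tool would take its list
from UPDATING.md.

```python
import ast
from typing import List, Tuple

# Hypothetical table for this sketch: old module path -> new module path.
DEPRECATED_MODULES = {
    "airflow.operators.bash_operator": "airflow.operators.bash",
    "airflow.operators.python_operator": "airflow.operators.python",
}


def find_deprecated_imports(source: str) -> List[Tuple[int, str, str]]:
    """Return (line number, old module, new module) for each listed import."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom):
            # "from <module> import ..." (module is None for "from . import x")
            if node.module in DEPRECATED_MODULES:
                findings.append(
                    (node.lineno, node.module, DEPRECATED_MODULES[node.module])
                )
        elif isinstance(node, ast.Import):
            # "import <module>" can list several modules on one line
            for alias in node.names:
                if alias.name in DEPRECATED_MODULES:
                    findings.append(
                        (node.lineno, alias.name, DEPRECATED_MODULES[alias.name])
                    )
    return sorted(findings)
```

Because the check runs on the syntax tree, a module path that only appears
in a comment or a string is not reported, which a plain text search would
get wrong.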




Re: [PROPOSAL] [FUTURE] Semi-automated tool for migration to 2.0.0

Posted by Jarek Potiuk <Ja...@polidea.com>.
Great, Claudio! Once we get closer to starting it, we can start some joint
work on it :).

I think we will also need the support of a number of "friendly" users for
that.

We can provide a basic migration tool initially, but it will take quite a
few iterations to perfect it and handle all the edge cases that we have not
thought about. So we will need to release some beta versions so that our
users can test the tool on real-life DAGs, and then handle those cases.

J.



RE: [PROPOSAL] [FUTURE] Semi-automated tool for migration to 2.0.0

Posted by Claudio <cl...@yahoo.it.INVALID>.
+1 Totally agree. I would really like to work on this tool!
Have a nice day!
Claudio