Posted to dev@airflow.apache.org by Kaxil Naik <ka...@apache.org> on 2020/09/16 15:12:42 UTC

[Meeting Notes] Airflow 2.0 Dev Call #4 - 14 Sep 2020

Hi all,

I have created a document to summarize the discussion from our fourth dev
call for Airflow 2.0.

Thank you all who joined the call.

*Doc Link*:
https://cwiki.apache.org/confluence/display/AIRFLOW/Meeting+Notes#MeetingNotes-#4:14Sep2020

To all those who attended, can you please double-check and add anything I
have missed?

To all those who didn't join, if you disagree with anything in
the Summary, please voice your opinion.

Also, please let me know if someone wants to include an item in the next
call's agenda.

Including the Summary here too (might potentially break formatting):

*Key Decisions*

   - *Updates*
      - The Airflow v2-0-test branch
      <https://github.com/apache/airflow/commits/v2-0-test> has already
      been cut and is currently rebased manually on top of Master. We
      don't run CI on it yet, as the branch is in sync with Master. As
      soon as we have a PR / commit that we don't want in 2.0, we will
      diverge the v2-0-test branch from Master and start running tests
      against it.
      - The upgrade-check PR <https://github.com/apache/airflow/pull/9467> was
      merged; we now need to define more rules to add more checks.
   - *API*
      - Progress:
         - Project Board: https://github.com/apache/airflow/projects/1
            - The issues labelled with "Enhancement" are not a requirement
            for 2.0
         - Endpoints:
            - Task Instance Endpoint
            <https://github.com/apache/airflow/pull/9597> is WIP; all the
            other endpoints have been implemented.
         - Permissions Model:
            - Discussion on the PR
            <https://github.com/apache/airflow/pull/10594> is ongoing but
            close to completion.
            - The next piece of work to be done is migrating existing Views
            to use resource-based permissions (GitHub issue
            <https://github.com/apache/airflow/issues/10469>). This is
            mainly for standardizing the permissions model across API and UI.
   - *Improvements to SubDags / Concept of TaskGroup*
      - AIP-34
      <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-34+TaskGroup%3A+A+UI+task+grouping+concept+as+an+alternative+to+SubDagOperator>
       | PR <https://github.com/apache/airflow/pull/10153> introduced the
      concept of TaskGroup and will be *included in Airflow 2.0*.
         - The PR implements TaskGroups for the Graph View; the Tree View
         will be implemented in follow-up PRs.
      - Follow-up items from the discussion:
         - Discuss on the mailing list whether we should deprecate SubDags
         in favour of TaskGroup in 2.0 or wait until Airflow 2.1 or 2.2.
         - Add docs around when to use TaskGroup vs SubDag, potentially
         listing pros and cons.
   - *Scheduler HA* (AIP-15
   <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103092651>)
      - A Draft PR <https://github.com/apache/airflow/pull/10956> has been
      created to enable code reviews and to allow members of the community
      to start testing it with various setups.
      - To get the most benefit from Scheduler HA on MySQL, users will need
      to use MySQL 8. This is because MySQL 5.7 does not support the SKIP
      LOCKED
      <https://dev.mysql.com/doc/refman/8.0/en/innodb-locking-reads.html#innodb-locking-reads-nowait-skip-locked>
      feature, but note that *MySQL 5.7 will still continue to work with at
      least the same or improved performance as now*.
      - Astronomer has done performance testing with different scenarios
      and will publish benchmarks over the coming weeks. The Google Composer
      Team and Polidea said that they would be happy to carry out various
      tests for Scheduler HA as well.
      - There were some concerns raised around locking timeout periods and
      the usage of DAG Serialization. More testing in the upcoming weeks
      should help mitigate any concerns and help fix any bugs that are
      discovered.
      - *Docs:*
         - Explicitly mention that the HA Scheduler reads some of the
         properties from the serialized_dag table. Users can turn DAG
         Serialization on/off in the Webserver, but the Scheduler will
         continue using it.
         - Do we recommend 2 schedulers for Production deployments?
         - X Schedulers vs a single Scheduler: use cases where one would be
         better than the other.
            - Some kind of bell curve showing where an increase in Schedulers
            stops improving performance and maybe also degrades it. This is
            intended to give guidance on what number of schedulers to run
            based on expected load, since this decision could be based on
            multiple factors.
      - Follow-up items:
         - Create a mailing list thread to discuss "Removing Pickling from
         Airflow 2.0". Currently, pickled DAGs are only supported by
         CeleryExecutor, and we have a flag on *airflow scheduler
         <https://airflow.readthedocs.io/en/latest/cli-ref.html#scheduler>*
         (--do-pickle) and "--ship-dag" on the *airflow tasks run
         <https://airflow.readthedocs.io/en/latest/cli-ref.html#run>* command.
         If we want to remove pickling, Airflow 2.0 is the right time;
         otherwise we shouldn't do it until 3.0.
   - *Helm Chart*
      - We will continue focusing on getting Airflow 2.0 out, so the first
      official release of the Helm Chart might need to wait.
      - The issue with Helm Chart sources was fixed and there are currently
      no blockers if we were to release it at some point in the near
      future.
      - Enhancements (but not blockers) are:
         - Better test coverage with integration tests
         - Docs pointing to the chart on the Airflow Website or the docsite
      - The artifacts for the Helm chart would be published at
      https://downloads.apache.org/airflow/
      - There is still an open question around the *Helm Chart Versioning
      Policy*, i.e. do we want to tie Helm Chart versions to Airflow
      versions, or do we just start from *1.0.0*? This needs to be decided
      before the release of the Helm Chart.
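
For anyone unfamiliar with the row-locking feature mentioned under Scheduler
HA (documented by MySQL 8 as SELECT ... FOR UPDATE SKIP LOCKED), here is a
rough in-memory sketch of the idea. It is only an analogy and not the code
from the draft PR: the Row class, the helper function and the row ids are all
made up for illustration, and a threading.Lock stands in for a database row
lock. The point is that a second scheduler asking for work skips rows that the
first one has already locked, instead of waiting on them.

```python
import threading


class Row:
    """Hypothetical table row; the Lock stands in for a database row lock."""

    def __init__(self, row_id):
        self.row_id = row_id
        self.lock = threading.Lock()


def select_for_update_skip_locked(rows, limit):
    """Lock and return up to `limit` rows, skipping rows that are already locked."""
    claimed = []
    for row in rows:
        if len(claimed) >= limit:
            break
        # Non-blocking acquire: a row held by someone else is skipped, not waited on.
        if row.lock.acquire(blocking=False):
            claimed.append(row)
    return claimed


def release(claimed):
    """Release the locks, comparable to committing the transaction."""
    for row in claimed:
        row.lock.release()


rows = [Row(i) for i in range(6)]

# "Scheduler A" claims three rows and keeps them locked.
first = select_for_update_skip_locked(rows, limit=3)
# "Scheduler B" asks for three rows while A still holds its locks.
second = select_for_update_skip_locked(rows, limit=3)

first_ids = [row.row_id for row in first]
second_ids = [row.row_id for row in second]
print(first_ids, second_ids)

release(first)
release(second)
```

In this sketch the two callers end up with disjoint sets of rows. Without a
skip-locked style read, the second caller would have to wait for the first
one's locks or pick the same rows, which is why the notes expect the biggest
gains on databases that support it.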



*Things to Discuss Next*

   - *21 September (Subject to Change)*
      - Finish up open discussion items from the earlier meeting if not yet
      resolved:
         - Providers versioning,
         - SubDag deprecation,
         - Helm Chart release,
         - REST API permissions
         - Docs changes
      - UI Changes for 2.0
         - Minimum effort changes: CSS/colours/spacing to make the UI look
         a bit more modern
      - Process:
         - When should we defer the in-scope items to post-2.0?
            - Completion by a date?
            - Progress by a date?


Regards,
Kaxil

Re: [Meeting Notes] Airflow 2.0 Dev Call #4 - 14 Sep 2020

Posted by Jarek Potiuk <Ja...@polidea.com>.
I think there was one point at the last meeting that we did not discuss
because we ran out of time: the documentation. I have some input to the
discussion, so I thought I would share it here before the next call. I
believe we ended up with "we have to think and review what to improve".

I personally think we have rather good "contributor's" documentation but we
still have some gaps in the "user docs".

I looked at "my parts" and tried to think in the "User" way. I do not do a
lot of user-facing stuff, but one specific part that is user-facing (and
is part of the 2.0 delivery) is the Docker images.

Over the last months, since we released the image, I gathered some feedback
and listened to complaints :) from users, and I turned those into better
documentation: https://github.com/apache/airflow/pull/10998 .

It is mostly about extracting the parts that are interesting to users
into the "docs" and leaving the "developer" details in the .rst files in
the non-docs part of the repo, but also about adding some short "guides"
showing users how they can use the image before diving into details.
I created a separate "production-deployment.rst" doc where we might
explain more details about Helm, Kubernetes, etc. and link to other
relevant parts of the documentation.

I think this might be a common theme: some documents might already be
there but are not easy for users to find.

I'd love to hear comments from others - either here or in the PR.

J.


-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>

Re: [Meeting Notes] Airflow 2.0 Dev Call #4 - 14 Sep 2020

Posted by Jarek Potiuk <Ja...@polidea.com>.
Good summary. Thanks, Kaxil!


-- 

Jarek Potiuk
Polidea | Principal Software Engineer

M: +48 660 796 129