Posted to dev@airflow.apache.org by Ash Berlin-Taylor <as...@apache.org> on 2020/10/13 19:46:48 UTC

Airflow 2.0.0.alpha1 snapshot ready for testing!

I'm proud to announce the availability of Apache Airflow 2.0.0.alpha1 for testing!

First the caveat: this is an alpha release. Do not run it in production: it may well have serious problems, and in the extreme case you may have to reset your database between this and the beta or release candidates. (This is extremely unlikely, but don't say we didn't warn you.)
This "snapshot" is intended for members of the Airflow developer community to test the build and get an early start on testing 2.0.0. For clarity, this is not an official release of Apache Airflow either - that doesn't happen until we make a release candidate and then vote on that, and based on the expected timelines on the Airflow 2.0 planning page (https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+2.0+-+Planning), we expect that to happen the week of 30th Nov, 2020.
This is quite a big change, so for this alpha release you shouldn't necessarily expect your DAGs to work unchanged -- please read https://github.com/apache/airflow/blob/2.0.0a1/UPDATING.md#airflow-200a1 for updating notes. Before we release 2.0.0 fully we will release a 1.10.13 that provides an automated tool to identify many of the changes that you will need to make before upgrading to 2.0.
The alpha snapshot is available at:
https://dist.apache.org/repos/dist/dev/airflow/2.0.0a1/
*apache-airflow-2.0.0a1-source.tar.gz* is a source release that comes with INSTALL instructions.
*apache-airflow-2.0.0a1-bin.tar.gz* is the binary Python "sdist" snapshot.
*apache_airflow-2.0.0a1-py3-none-any.whl* is the binary Python wheel snapshot.
This snapshot has not been pushed to PyPI.
Public keys are available at: https://www.apache.org/dist/airflow/KEYS
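
As a rough sketch of one integrity check you might run on a downloaded snapshot before installing it: the Python below computes the SHA-512 digest of a local file and compares it with an expected hex string. It assumes a SHA-512 checksum is published next to each artifact (the announcement itself only mentions the KEYS file), and the helper names are made up for illustration. It is not a substitute for verifying the GPG signature against the KEYS file.

```python
import hashlib
import hmac
from pathlib import Path


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha512()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare the file's digest with an expected hex string.

    Whitespace and letter case in `expected_hex` are ignored, because
    published checksum files are not always formatted the same way.
    """
    normalized = "".join(expected_hex.split()).lower()
    return hmac.compare_digest(sha512_of(path), normalized)
```

Typical use would be matches_checksum("apache-airflow-2.0.0a1-bin.tar.gz", expected), where expected is the hex string copied from the published checksum.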
The full changelog is about 2,000 lines long (already excluding anything backported to 1.10), so there is no full changelog yet, but the major features in 2.0.0alpha1 compared to 1.10.12 are:
Decorated Flows (AIP-31)
(Used to be called Functional DAGs.)
DAGs are now much, much nicer to author, especially when using PythonOperator: dependencies are handled more clearly and XCom is nicer to use.
Read more here:
Decorated Flow Documentation (https://airflow.readthedocs.io/en/latest/concepts.html#decorated-flows)
Fully specified REST API (AIP-32)
We now have a fully supported, no-longer-experimental API with a fully published OpenAPI specification.
Read more here:
REST API Documentation (https://airflow.readthedocs.io/en/latest/stable-rest-api-ref.html)
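
To give a feel for calling the new API, here is a minimal sketch using only the Python standard library. It builds a request for the DAG list endpoint and reads the DAG ids from the JSON response. The `/api/v1/dags` path, the `limit` parameter and the `dags` / `dag_id` response fields are taken from my reading of the published OpenAPI spec, so check them against the docs linked above. The host, credentials and function names are hypothetical, and the sketch assumes the deployment accepts HTTP basic auth, which may not be how yours is configured.

```python
import base64
import json
import urllib.request


def build_list_dags_request(base_url, username, password, limit=10):
    """Build (but do not send) a GET request for the DAG list endpoint."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    url = f"{base_url.rstrip('/')}/api/v1/dags?limit={limit}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
        method="GET",
    )


def list_dag_ids(request):
    """Send the request and return the dag_id of each DAG in the response."""
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # Assumed response shape: {"dags": [{"dag_id": ...}, ...], ...}
    return [dag["dag_id"] for dag in payload["dags"]]
```

Against a running webserver this would be used as list_dag_ids(build_list_dags_request("http://localhost:8080", "user", "pass")).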
Massive Scheduler performance improvements
As part of AIP-15 (Scheduler HA+performance) and other work Kamil did, we have made significant performance improvements to the Airflow Scheduler, and it now starts tasks much, MUCH quicker.
We will follow up with exact benchmark figures (we want to triple check them as we don't quite believe the numbers!).
Scheduler is now HA compatible (AIP-15)
It's now possible and supported to run more than a single scheduler instance, either for resiliency in case one goes down, or to get higher scheduling performance.
To fully use this feature you need Postgres 9.6+ or MySQL 8+ (MySQL 5 won't work with more than one scheduler I'm afraid).
There's no config or other setup required to run more than one scheduler: just start up a second scheduler somewhere else (ensuring it has access to the DAG files) and they will all cooperate through the database.
Docs PR here: Scheduler HA documentation PR (https://github.com/apache/airflow/pull/11467/files)
Task Groups (AIP-34, Docs)
SubDAGs are useful for grouping tasks in the UI but have many drawbacks in their execution behaviour (such as only executing a single task in parallel!), so we've introduced a new concept called "Task Groups", which provides the same grouping behaviour as SubDAGs but doesn't have any of the execution-time drawbacks.
Read more here: Task Grouping Documentation (https://airflow.readthedocs.io/en/latest/concepts.html#taskgroup)
Refreshed UI
We've given the Airflow UI a visual refresh (https://github.com/apache/airflow/pull/11195) and updated some of the styling. Check out the screenshots in the docs (https://airflow.readthedocs.io/en/latest/ui.html).
Smart Sensors for reduced load from sensors (AIP-17)
If you make heavy use of sensors in your Airflow cluster, you may find that sensor execution takes up a significant proportion of your cluster, even with "reschedule" mode. So we've added a new mode called "Smart Sensors".
This feature is in "early-access" - it's been well tested by Airbnb, so is "stable"/usable, but we reserve the right to make backwards incompatible changes in a future release (if we have to; we'll try very hard not to!).
Docs on: Smart Sensors (https://airflow.readthedocs.io/en/latest/smart-sensor.html?highlight=smartsensors)
Simplified KubernetesExecutor
For Airflow 2.0, we have re-architected the KubernetesExecutor in a fashion that is simultaneously faster, simpler to understand, and offers far more flexibility to Airflow users. Users will now be able to access the full Kubernetes API to create a yaml `pod_template_file` instead of filling in parameters in their airflow.cfg.
We have also replaced the `executor_config` dictionary with the `pod_override` parameter, which takes a Kubernetes V1Pod object for a clear 1:1 override setting. These changes have removed over three thousand lines of code for the KubernetesExecutor, which simultaneously makes it run faster and creates fewer potential errors.
Read more here:
Docs on pod_template_file (https://airflow.readthedocs.io/en/latest/executor/kubernetes.html?highlight=pod_override#pod-template-file)
Docs on pod_override (https://airflow.readthedocs.io/en/latest/executor/kubernetes.html?highlight=pod_override#pod-override)

We've tried to make as few breaking changes as possible, and to provide a deprecation path in the code, especially for anything called in the DAG, but please read through UPDATING.md to check what might affect you. For instance, we have re-organized the layout of operators (they now all live under airflow.providers.*), but the old names should continue to work; you'll just notice a lot of DeprecationWarnings that you should fix up.
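
Python hides many DeprecationWarnings by default, so it can help to surface them explicitly while testing your DAGs. The sketch below uses only the standard `warnings` module. The two "operator factory" functions are hypothetical stand-ins for an old import path forwarding to a new one, not Airflow code, and `collect_deprecations` is a made-up helper name.

```python
import warnings


def new_operator_factory(task_id):
    """Stand-in for a class at its new location (hypothetical)."""
    return {"task_id": task_id}


def old_operator_factory(task_id):
    """Stand-in for the old name: warns, then forwards to the new one."""
    warnings.warn(
        "old_operator_factory is deprecated; use new_operator_factory",
        DeprecationWarning,
        stacklevel=2,
    )
    return new_operator_factory(task_id)


def collect_deprecations(func, *args, **kwargs):
    """Call `func` and return (result, list of deprecation messages raised)."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every DeprecationWarning instead of applying default filters.
        warnings.simplefilter("always", DeprecationWarning)
        result = func(*args, **kwargs)
    messages = [
        str(w.message) for w in caught if issubclass(w.category, DeprecationWarning)
    ]
    return result, messages
```

In a test run, the same effect can be had without code changes by starting Python with -W error::DeprecationWarning, which turns each such warning into an exception.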

Thank you so much to all the contributors who got us to this point, in no particular order: Kaxil Naik, Daniel Imberman, Jarek Potiuk, Tomek Urbaszek, Kamil Breguła, Gerard Casas Saez, Kevin Yang, James Timmins, Yingbo Wang, Qian Yu, Ryan Hamilton and the 100s of others who keep making Airflow better for everyone.

Re: Airflow 2.0.0.alpha1 snapshot ready for testing!

Posted by Jarek Potiuk <Ja...@polidea.com>.
A few points to add from my side to discuss today, to plan the work for the
upcoming week:

* For the 1st Beta I think it will be important to release PyPI packages
* I think the automated providers registering might be partially ready as
well for the beta, probably not all of the features - I plan them for beta2
* I think even if this is not yet needed for betas, we need to discuss how
to approach per-provider documentation separation. It would be great to
fully separate - especially the AutoAPI building per-provider (Also on the
CI).

J.


On Sun, Oct 18, 2020 at 10:58 PM Jarek Potiuk <Ja...@polidea.com>
wrote:

> The usual pre-meeting summary from my side.
>
> In preparation for the Airflow 2.0 Dev call tomorrow, I have prepared some bug
> fixes and improvements to the providers approach. We are steadily moving
> forward in the mini-project https://github.com/apache/airflow/projects/5 .
> Some of them are already merged (thanks to those who reviewed), but I have
> 2 PRs in progress to get to the place where it is ready to release alpha2:
>
> https://github.com/apache/airflow/pull/11630 - The .tar.gz provider
> packages are installable now.
> https://github.com/apache/airflow/pull/11586 - Fixes versioning for
> pre-release provider packages
>
> The small thing that I've stumbled upon while preparing the provider
> packages was the Licences/Notice review -
> https://github.com/apache/airflow/issues/11632 . As Ash mentioned in the
> comment, the current "dupli/tripli-cation of reporting the licences" is
> likely OK (as we've done that at incubator graduation). Ry Walker
> already had some comments / results of licence checks recently, so maybe we
> can talk about it and agree on some common approach (or agree that there is
> nothing to discuss) tomorrow.
>
> For the provider's packages, I reviewed and removed all the license deps
> as not needed, but it would be great to talk about it tomorrow anyway.
>
> J.
>

-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
[image: Polidea] <https://www.polidea.com/>

Re: Airflow 2.0.0.alpha1 snapshot ready for testing!

Posted by Jarek Potiuk <Ja...@polidea.com>.
The usual pre-meeting summary from my side.

In preparation for the Airflow 2.0 Dev call tomorrow, I have prepared some bug
fixes and improvements to the providers approach. We are steadily moving
forward in the mini-project https://github.com/apache/airflow/projects/5 .
Some of them are already merged (thanks to those who reviewed), but I have
2 PRs in progress to get to the place where it is ready to release alpha2:

https://github.com/apache/airflow/pull/11630 - The .tar.gz provider
packages are installable now.
https://github.com/apache/airflow/pull/11586 - Fixes versioning for
pre-release provider packages

The small thing that I've stumbled upon while preparing the provider
packages was the Licences/Notice review -
https://github.com/apache/airflow/issues/11632 . As Ash mentioned in the
comment, the current "dupli/tripli-cation of reporting the licences" is
likely OK (as we've done that at incubator graduation). Ry Walker
already had some comments / results of licence checks recently, so maybe we
can talk about it and agree on some common approach (or agree that there is
nothing to discuss) tomorrow.

For the provider's packages, I reviewed and removed all the license deps as
not needed, but it would be great to talk about it tomorrow anyway.

J.


On Wed, Oct 14, 2020 at 9:00 AM Jarek Potiuk <Ja...@polidea.com>
wrote:

> A small follow up: The 2.0.0a1 release is the "core" release only. It has
> no "providers" installed. Airflow 2.0 will be distributed as a number of
> separate packages: "core" will be released separately and each of the
> providers has its own package to install.
> Once we release it on PyPI, the right provider packages will be installed
> automatically when you install the right extra (so pip install
> apache-airflow[google] will also pull in the latest
> apache-airflow-providers-google package), but for now you need to install
> those packages manually.
> The 0.0.1a1 versions of all provider packages are available at
> https://dist.apache.org/repos/dist/dev/airflow/providers/0.0.1a1/
>
> And big congrats to the whole team for pulling this together! That is a
> huge milestone!
>
> J.


Re: Airflow 2.0.0.alpha1 snapshot ready for testing!

Posted by Jarek Potiuk <Ja...@polidea.com>.
A small follow up: The 2.0.0a1 release is the "core" release only. It has
no "providers" installed. Airflow 2.0 will be distributed as a number of
separate packages: "core" will be released separately and each of the
providers has its own package to install.
Once we release it on PyPI, the right provider packages will be installed
automatically when you install the right extra (so pip install
apache-airflow[google] will also pull in the latest
apache-airflow-providers-google package), but for now you need to install
those packages manually.
The 0.0.1a1 versions of all provider packages are available at
https://dist.apache.org/repos/dist/dev/airflow/providers/0.0.1a1/

And big congrats to the whole team for pulling this together! That is a
huge milestone!

J.


