Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/11/09 16:09:22 UTC

[GitHub] [airflow] petedejoy commented on a change in pull request #12203: Adds provider package documentation in installation.rst

petedejoy commented on a change in pull request #12203:
URL: https://github.com/apache/airflow/pull/12203#discussion_r519922920



##########
File path: docs/installation.rst
##########
@@ -289,6 +299,100 @@ Here's the list of the subpackages and what they enable:
 | winrm               | ``pip install 'apache-airflow[microsoft.winrm]'``   | WinRM hooks and operators                                            |
 +---------------------+-----------------------------------------------------+----------------------------------------------------------------------+
 
+Provider packages
+.................
+
+
+Provider packages context
+-------------------------
+
+Unlike Apache Airflow 1.10, the Airflow 2.0 is delivered in multiple, separate, but connected packages.
+The core of Airflow scheduling system is delivered as ``apache-airflow`` package and there are around
+60 providers packages which can be installed separately as so called "Airflow Provider packages".
+Those provider packages are separated per-provider (for example ``amazon``, ``google``, ``salesforce``
+etc.)  Those packages are available as ``apache-airflow-providers`` packages - separately per ech provider
+(for example there is an ``apache-airflow-providers-amazon`` or ``apache-airflow-providers-google`` package.
+
+You can install those provider packages separately in order to interface with a given provider. For those
+providers that have corresponding extras, the provider packages (latest version from PyPI) are installed
+automatically when Airflow is installed with the extra.
+
+Providers are released and versioned separately from the Airflow releases. We are following the
+`Semver <https://semver.org/>`_ versioning scheme for the packages. Some versions of the provider
+packages might depend on particular versions of Airflow, but the general approach we have is that unless
+there is a good reason, new version of providers should work with recent versions of Airflow 2.x. Details
+will vary per-provider and if there is a limitation for particular version of particular provider,
+constraining the Airflow version used, it will be included as limitation of dependencies in the provider
+package.
+
+Some of the providers have cross-provider dependencies as well. Those are not required dependencies, they
+might simply enabled certain features (for example transfer operators often create dependency between
+different providers. Again, the general approach her is that the providers are backwards compatible,
+including cross-dependencies. Any kind of breaking changes and requirements on particular versions of other
+provider packages are automatically documented in the release notes of every provider.
+
+.. note::
+    We also provide ``apache-airflow-backport-providers`` packages that can be installed for Airflow 1.10.
+    Those are the same providers as for 2.0 but automatically back-ported to work for Airflow 1.10. Those
+    backport providers are going to be updated and released for 3 months after Apache Airflow 2.0 release.
+
+Provider packages functionality
+-------------------------------
+
+Separate provider packages provide the possibilities that were not available in 1.10:
+
+1. You can upgrade to latest version of particular providers without the need of Apache Airflow core upgrade
+
+2. You can downgrade to previous version of particular provider in case the new version introduces
+   some problems, without impacting the main Apache Airflow core package.
+
+3. You can release and upgrade/downgrade provider packages incrementally, independent from each other. This
+   means that you can incrementally validate each of the provider package update in your environment,
+   following the usual tests you have in your environment.
+
+
+Q&A for provider packages
+-------------------------
+
+Q. When upgrading to a new Airflow version such as 2.0, but possibly 2.0.1 and beyond, is the best practice
+   to also upgrade provider packages at the same time?
+
+A. It depends on your use case. If you have automated or semi-automated verification of your installation,
+   that you can run a new version of Airflow including all provider packages, then definitely go for it.
+   If you rely more on manual testing, it is advised that you upgrade in stages. Depending on your choice
+   you can either upgrade all used provider packages first, and then upgrade Airflow Core or the other way
+   round. The first approach - when you first upgrade all providers is probably safer, as you can do it
+   incrementally, step-by-step replacing provider by provider in your environment.
+
+
+Q. I have an Airflow version (1.10.12) running and it is stable. However, because of a Cloud provider change,
+   I would like to upgrade the provider package. If I don't need to upgrade the Airflow version anymore,
+   how do I know that this provider version is compatible with my Airflow version?
+
+
+A. Backport Provider Packages (those are needed in 1.10.* Airflow series) are going to be released for
+   3 months after the release. We will stop releasing new updates to the backport providers afterwards.
+   You will ba able to continue using the provider packages that you already use and unless you need to

Review comment:
       Small typo here: `ba` should be `be` :)

##########
File path: docs/installation.rst
##########
@@ -289,6 +299,100 @@ Here's the list of the subpackages and what they enable:
 | winrm               | ``pip install 'apache-airflow[microsoft.winrm]'``   | WinRM hooks and operators                                            |
 +---------------------+-----------------------------------------------------+----------------------------------------------------------------------+
 
+Provider packages
+.................
+
+
+Provider packages context
+-------------------------
+
+Unlike Apache Airflow 1.10, the Airflow 2.0 is delivered in multiple, separate, but connected packages.
+The core of Airflow scheduling system is delivered as ``apache-airflow`` package and there are around
+60 providers packages which can be installed separately as so called "Airflow Provider packages".
+Those provider packages are separated per-provider (for example ``amazon``, ``google``, ``salesforce``
+etc.)  Those packages are available as ``apache-airflow-providers`` packages - separately per ech provider
+(for example there is an ``apache-airflow-providers-amazon`` or ``apache-airflow-providers-google`` package.
+
+You can install those provider packages separately in order to interface with a given provider. For those
+providers that have corresponding extras, the provider packages (latest version from PyPI) are installed
+automatically when Airflow is installed with the extra.
+
+Providers are released and versioned separately from the Airflow releases. We are following the
+`Semver <https://semver.org/>`_ versioning scheme for the packages. Some versions of the provider
+packages might depend on particular versions of Airflow, but the general approach we have is that unless
+there is a good reason, new version of providers should work with recent versions of Airflow 2.x. Details
+will vary per-provider and if there is a limitation for particular version of particular provider,
+constraining the Airflow version used, it will be included as limitation of dependencies in the provider
+package.
+
+Some of the providers have cross-provider dependencies as well. Those are not required dependencies, they
+might simply enabled certain features (for example transfer operators often create dependency between
+different providers. Again, the general approach her is that the providers are backwards compatible,
+including cross-dependencies. Any kind of breaking changes and requirements on particular versions of other
+provider packages are automatically documented in the release notes of every provider.
+
+.. note::
+    We also provide ``apache-airflow-backport-providers`` packages that can be installed for Airflow 1.10.
+    Those are the same providers as for 2.0 but automatically back-ported to work for Airflow 1.10. Those
+    backport providers are going to be updated and released for 3 months after Apache Airflow 2.0 release.
+
+Provider packages functionality
+-------------------------------
+
+Separate provider packages provide the possibilities that were not available in 1.10:
+
+1. You can upgrade to latest version of particular providers without the need of Apache Airflow core upgrade

Review comment:
       Missing a period here.

##########
File path: docs/installation.rst
##########
@@ -289,6 +299,100 @@ Here's the list of the subpackages and what they enable:
 | winrm               | ``pip install 'apache-airflow[microsoft.winrm]'``   | WinRM hooks and operators                                            |
 +---------------------+-----------------------------------------------------+----------------------------------------------------------------------+
 
+Provider packages
+.................
+
+
+Provider packages context
+-------------------------
+
+Unlike Apache Airflow 1.10, the Airflow 2.0 is delivered in multiple, separate, but connected packages.
+The core of Airflow scheduling system is delivered as ``apache-airflow`` package and there are around
+60 providers packages which can be installed separately as so called "Airflow Provider packages".
+Those provider packages are separated per-provider (for example ``amazon``, ``google``, ``salesforce``
+etc.)  Those packages are available as ``apache-airflow-providers`` packages - separately per ech provider
+(for example there is an ``apache-airflow-providers-amazon`` or ``apache-airflow-providers-google`` package.
+
+You can install those provider packages separately in order to interface with a given provider. For those
+providers that have corresponding extras, the provider packages (latest version from PyPI) are installed
+automatically when Airflow is installed with the extra.
+
+Providers are released and versioned separately from the Airflow releases. We are following the
+`Semver <https://semver.org/>`_ versioning scheme for the packages. Some versions of the provider
+packages might depend on particular versions of Airflow, but the general approach we have is that unless
+there is a good reason, new version of providers should work with recent versions of Airflow 2.x. Details
+will vary per-provider and if there is a limitation for particular version of particular provider,
+constraining the Airflow version used, it will be included as limitation of dependencies in the provider
+package.
+
+Some of the providers have cross-provider dependencies as well. Those are not required dependencies, they
+might simply enabled certain features (for example transfer operators often create dependency between
+different providers. Again, the general approach her is that the providers are backwards compatible,
+including cross-dependencies. Any kind of breaking changes and requirements on particular versions of other
+provider packages are automatically documented in the release notes of every provider.
+
+.. note::
+    We also provide ``apache-airflow-backport-providers`` packages that can be installed for Airflow 1.10.
+    Those are the same providers as for 2.0 but automatically back-ported to work for Airflow 1.10. Those
+    backport providers are going to be updated and released for 3 months after Apache Airflow 2.0 release.
+
+Provider packages functionality
+-------------------------------
+
+Separate provider packages provide the possibilities that were not available in 1.10:
+
+1. You can upgrade to latest version of particular providers without the need of Apache Airflow core upgrade
+
+2. You can downgrade to previous version of particular provider in case the new version introduces
+   some problems, without impacting the main Apache Airflow core package.
+
+3. You can release and upgrade/downgrade provider packages incrementally, independent from each other. This
+   means that you can incrementally validate each of the provider package update in your environment,
+   following the usual tests you have in your environment.
+
+
+Q&A for provider packages
+-------------------------
+
+Q. When upgrading to a new Airflow version such as 2.0, but possibly 2.0.1 and beyond, is the best practice
+   to also upgrade provider packages at the same time?
+
+A. It depends on your use case. If you have automated or semi-automated verification of your installation,
+   that you can run a new version of Airflow including all provider packages, then definitely go for it.
+   If you rely more on manual testing, it is advised that you upgrade in stages. Depending on your choice
+   you can either upgrade all used provider packages first, and then upgrade Airflow Core or the other way
+   round. The first approach - when you first upgrade all providers is probably safer, as you can do it
+   incrementally, step-by-step replacing provider by provider in your environment.
+
+
+Q. I have an Airflow version (1.10.12) running and it is stable. However, because of a Cloud provider change,
+   I would like to upgrade the provider package. If I don't need to upgrade the Airflow version anymore,
+   how do I know that this provider version is compatible with my Airflow version?
+
+
+A. Backport Provider Packages (those are needed in 1.10.* Airflow series) are going to be released for
+   3 months after the release. We will stop releasing new updates to the backport providers afterwards.
+   You will ba able to continue using the provider packages that you already use and unless you need to
+   get some new release of the provider that is only released in master, there is no need to upgrade

Review comment:
       What would be a case in which this happens (i.e., a provider is only released in master)? If the BaseOperator changes, or something like that?

##########
File path: docs/installation.rst
##########
@@ -289,6 +299,100 @@ Here's the list of the subpackages and what they enable:
 | winrm               | ``pip install 'apache-airflow[microsoft.winrm]'``   | WinRM hooks and operators                                            |
 +---------------------+-----------------------------------------------------+----------------------------------------------------------------------+
 
+Provider packages
+.................
+
+
+Provider packages context

Review comment:
       My perception could be wrong, but a big benefit I see to this feature is how "dependency-light" it makes the Airflow base. Since all of these providers (which have their own sets of complex dependencies) are no longer included in the base, a user is less likely to run into one-off dependency conflicts when trying to install additional libraries. I wonder if that's worth calling out in the context here? Curious to hear your thoughts @potiuk.



