Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/11/09 15:40:47 UTC

[GitHub] [airflow] potiuk opened a new pull request #12203: Adds provider package documentation in installation.rst

potiuk opened a new pull request #12203:
URL: https://github.com/apache/airflow/pull/12203


   Addresses initial version of #11880
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #12203: Adds provider package documentation in installation.rst

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #12203:
URL: https://github.com/apache/airflow/pull/12203#discussion_r520473163



##########
File path: docs/installation.rst
##########
@@ -289,6 +303,103 @@ Here's the list of the subpackages and what they enable:
 | winrm               | ``pip install 'apache-airflow[microsoft.winrm]'``   | WinRM hooks and operators                                            |
 +---------------------+-----------------------------------------------------+----------------------------------------------------------------------+
 
+Provider packages
+.................
+
+
+Provider packages context
+-------------------------
+
+Unlike Apache Airflow 1.10, the Airflow 2.0 is delivered in multiple, separate, but connected packages.
+The core of Airflow scheduling system is delivered as ``apache-airflow`` package and there are around
+60 providers packages which can be installed separately as so called "Airflow Provider packages".
+Those provider packages are separated per-provider (for example ``amazon``, ``google``, ``salesforce``
+etc.)  Those packages are available as ``apache-airflow-providers`` packages - separately per each provider
+(for example there is an ``apache-airflow-providers-amazon`` or ``apache-airflow-providers-google`` package.
+
+You can install those provider packages separately in order to interface with a given provider. For those
+providers that have corresponding extras, the provider packages (latest version from PyPI) are installed
+automatically when Airflow is installed with the extra.
+
+Providers are released and versioned separately from the Airflow releases. We are following the
+`Semver <https://semver.org/>`_ versioning scheme for the packages. Some versions of the provider
+packages might depend on particular versions of Airflow, but the general approach we have is that unless
+there is a good reason, new version of providers should work with recent versions of Airflow 2.x. Details
+will vary per-provider and if there is a limitation for particular version of particular provider,
+constraining the Airflow version used, it will be included as limitation of dependencies in the provider
+package.
+
+Some of the providers have cross-provider dependencies as well. Those are not required dependencies, they
+might simply enabled certain features (for example transfer operators often create dependency between
+different providers. Again, the general approach her is that the providers are backwards compatible,
+including cross-dependencies. Any kind of breaking changes and requirements on particular versions of other
+provider packages are automatically documented in the release notes of every provider.

Review comment:
   ```suggestion
   Some of the providers have cross-provider dependencies as well. Those are not required dependencies; they
   might simply enable certain features (for example, transfer operators often create a dependency between
   different providers). Again, the general approach here is that the providers are backwards compatible,
   including cross-dependencies. Any kind of breaking changes and requirements on particular versions of other
   provider packages are automatically documented in the release notes of every provider.
   ```

##########
File path: docs/installation.rst
##########
@@ -289,6 +303,103 @@ Here's the list of the subpackages and what they enable:
 | winrm               | ``pip install 'apache-airflow[microsoft.winrm]'``   | WinRM hooks and operators                                            |
 +---------------------+-----------------------------------------------------+----------------------------------------------------------------------+
 
+Provider packages
+.................
+
+
+Provider packages context
+-------------------------
+
+Unlike Apache Airflow 1.10, the Airflow 2.0 is delivered in multiple, separate, but connected packages.
+The core of Airflow scheduling system is delivered as ``apache-airflow`` package and there are around
+60 providers packages which can be installed separately as so called "Airflow Provider packages".
+Those provider packages are separated per-provider (for example ``amazon``, ``google``, ``salesforce``
+etc.)  Those packages are available as ``apache-airflow-providers`` packages - separately per each provider
+(for example there is an ``apache-airflow-providers-amazon`` or ``apache-airflow-providers-google`` package.
+
+You can install those provider packages separately in order to interface with a given provider. For those
+providers that have corresponding extras, the provider packages (latest version from PyPI) are installed
+automatically when Airflow is installed with the extra.
+
+Providers are released and versioned separately from the Airflow releases. We are following the
+`Semver <https://semver.org/>`_ versioning scheme for the packages. Some versions of the provider
+packages might depend on particular versions of Airflow, but the general approach we have is that unless
+there is a good reason, new version of providers should work with recent versions of Airflow 2.x. Details
+will vary per-provider and if there is a limitation for particular version of particular provider,
+constraining the Airflow version used, it will be included as limitation of dependencies in the provider
+package.
+
+Some of the providers have cross-provider dependencies as well. Those are not required dependencies, they
+might simply enabled certain features (for example transfer operators often create dependency between
+different providers. Again, the general approach her is that the providers are backwards compatible,
+including cross-dependencies. Any kind of breaking changes and requirements on particular versions of other
+provider packages are automatically documented in the release notes of every provider.
+
+.. note::
+    We also provide ``apache-airflow-backport-providers`` packages that can be installed for Airflow 1.10.
+    Those are the same providers as for 2.0 but automatically back-ported to work for Airflow 1.10. Those
+    backport providers are going to be updated and released for 3 months after Apache Airflow 2.0 release.
+
+Provider packages functionality
+-------------------------------
+
+Separate provider packages provide the possibilities that were not available in 1.10:
+
+1. You can upgrade to latest version of particular providers without the need of Apache Airflow core upgrade.
+
+2. You can downgrade to previous version of particular provider in case the new version introduces
+   some problems, without impacting the main Apache Airflow core package.
+
+3. You can release and upgrade/downgrade provider packages incrementally, independent from each other. This
+   means that you can incrementally validate each of the provider package update in your environment,
+   following the usual tests you have in your environment.
+
+
+Q&A for provider packages
+-------------------------
+
+Q. When upgrading to a new Airflow version such as 2.0, but possibly 2.0.1 and beyond, is the best practice
+   to also upgrade provider packages at the same time?

Review comment:
       Can we convert the questions into headings so they can be linked, similar to https://raw.githubusercontent.com/apache/airflow/master/docs/faq.rst?

##########
File path: docs/installation.rst
##########
@@ -289,6 +303,103 @@ Here's the list of the subpackages and what they enable:
 | winrm               | ``pip install 'apache-airflow[microsoft.winrm]'``   | WinRM hooks and operators                                            |
 +---------------------+-----------------------------------------------------+----------------------------------------------------------------------+
 
+Provider packages
+.................
+
+
+Provider packages context
+-------------------------
+
+Unlike Apache Airflow 1.10, the Airflow 2.0 is delivered in multiple, separate, but connected packages.
+The core of Airflow scheduling system is delivered as ``apache-airflow`` package and there are around
+60 providers packages which can be installed separately as so called "Airflow Provider packages".
+Those provider packages are separated per-provider (for example ``amazon``, ``google``, ``salesforce``
+etc.)  Those packages are available as ``apache-airflow-providers`` packages - separately per each provider
+(for example there is an ``apache-airflow-providers-amazon`` or ``apache-airflow-providers-google`` package.
+
+You can install those provider packages separately in order to interface with a given provider. For those
+providers that have corresponding extras, the provider packages (latest version from PyPI) are installed
+automatically when Airflow is installed with the extra.
+
+Providers are released and versioned separately from the Airflow releases. We are following the
+`Semver <https://semver.org/>`_ versioning scheme for the packages. Some versions of the provider
+packages might depend on particular versions of Airflow, but the general approach we have is that unless
+there is a good reason, new version of providers should work with recent versions of Airflow 2.x. Details
+will vary per-provider and if there is a limitation for particular version of particular provider,
+constraining the Airflow version used, it will be included as limitation of dependencies in the provider
+package.
+
+Some of the providers have cross-provider dependencies as well. Those are not required dependencies, they
+might simply enabled certain features (for example transfer operators often create dependency between
+different providers. Again, the general approach her is that the providers are backwards compatible,
+including cross-dependencies. Any kind of breaking changes and requirements on particular versions of other
+provider packages are automatically documented in the release notes of every provider.
+
+.. note::
+    We also provide ``apache-airflow-backport-providers`` packages that can be installed for Airflow 1.10.
+    Those are the same providers as for 2.0 but automatically back-ported to work for Airflow 1.10. Those
+    backport providers are going to be updated and released for 3 months after Apache Airflow 2.0 release.
+
+Provider packages functionality
+-------------------------------
+
+Separate provider packages provide the possibilities that were not available in 1.10:
+
+1. You can upgrade to latest version of particular providers without the need of Apache Airflow core upgrade.
+
+2. You can downgrade to previous version of particular provider in case the new version introduces
+   some problems, without impacting the main Apache Airflow core package.
+
+3. You can release and upgrade/downgrade provider packages incrementally, independent from each other. This
+   means that you can incrementally validate each of the provider package update in your environment,
+   following the usual tests you have in your environment.
+
+
+Q&A for provider packages
+-------------------------
+
+Q. When upgrading to a new Airflow version such as 2.0, but possibly 2.0.1 and beyond, is the best practice
+   to also upgrade provider packages at the same time?
+
+A. It depends on your use case. If you have automated or semi-automated verification of your installation,
+   that you can run a new version of Airflow including all provider packages, then definitely go for it.
+   If you rely more on manual testing, it is advised that you upgrade in stages. Depending on your choice
+   you can either upgrade all used provider packages first, and then upgrade Airflow Core or the other way
+   round. The first approach - when you first upgrade all providers is probably safer, as you can do it
+   incrementally, step-by-step replacing provider by provider in your environment.
+
+
+Q. I have an Airflow version (1.10.12) running and it is stable. However, because of a Cloud provider change,
+   I would like to upgrade the provider package. If I don't need to upgrade the Airflow version anymore,
+   how do I know that this provider version is compatible with my Airflow version?
+
+
+A. Backport Provider Packages (those are needed in 1.10.* Airflow series) are going to be released for
+   3 months after the release. We will stop releasing new updates to the backport providers afterwards.
+   You will be able to continue using the provider packages that you already use and unless you need to
+   get some new release of the provider that is only released for 2.0, there is no need to upgrade
+   Airflow. This might happen if for example the provider is migrated to use newer version of client
+   libraries or when new features/operators/hooks are added to it. Those changes will only be
+   backported to 1.10.* compatible backport providers up to 3 months after releasing Airflow 2.0.
+   Also we expect more providers, changes and fixes added to the existing providers to come after the
+   3 months pass. Eventually you will have to upgrade to Airflow 2.0 if you would like to make use of those.
+   When it comes to compatibility of providers with different Airflow 2 versions, each
+   provider package will keep it's own dependencies, and while we expect those providers to be generally
+   backwards-compatible, particular versions of particular providers might introduce dependencies on
+   specific Airflow versions.

Review comment:
       ```suggestion
   A. Backport Provider Packages (those are needed in 1.10.* Airflow series) are going to be released for
      3 months after the release. We will stop releasing new updates to the backport providers afterwards.
      You will be able to continue using the provider packages that you already use and unless you need to
      get some new release of the provider that is only released for 2.0, there is no need to upgrade
      Airflow. This might happen if for example the provider is migrated to use a newer version of client
      libraries or when new features/operators/hooks are added to it. Those changes will only be
      backported to 1.10.* compatible backport providers up to 3 months after releasing Airflow 2.0.
      Also, we expect more providers, changes and fixes added to the existing providers to come after the
      3 months pass. Eventually, you will have to upgrade to Airflow 2.0 if you would like to make use of those.
      When it comes to compatibility of providers with different Airflow 2 versions, each
      provider package will keep its own dependencies, and while we expect those providers to be generally
      backwards-compatible, particular versions of particular providers might introduce dependencies on
      specific Airflow versions.
   ```

##########
File path: docs/installation.rst
##########
@@ -289,6 +303,103 @@ Here's the list of the subpackages and what they enable:
 | winrm               | ``pip install 'apache-airflow[microsoft.winrm]'``   | WinRM hooks and operators                                            |
 +---------------------+-----------------------------------------------------+----------------------------------------------------------------------+
 
+Provider packages
+.................
+
+
+Provider packages context
+-------------------------
+
+Unlike Apache Airflow 1.10, the Airflow 2.0 is delivered in multiple, separate, but connected packages.
+The core of Airflow scheduling system is delivered as ``apache-airflow`` package and there are around
+60 providers packages which can be installed separately as so called "Airflow Provider packages".
+Those provider packages are separated per-provider (for example ``amazon``, ``google``, ``salesforce``
+etc.)  Those packages are available as ``apache-airflow-providers`` packages - separately per each provider
+(for example there is an ``apache-airflow-providers-amazon`` or ``apache-airflow-providers-google`` package.
+
+You can install those provider packages separately in order to interface with a given provider. For those
+providers that have corresponding extras, the provider packages (latest version from PyPI) are installed
+automatically when Airflow is installed with the extra.
+
+Providers are released and versioned separately from the Airflow releases. We are following the
+`Semver <https://semver.org/>`_ versioning scheme for the packages. Some versions of the provider
+packages might depend on particular versions of Airflow, but the general approach we have is that unless
+there is a good reason, new version of providers should work with recent versions of Airflow 2.x. Details
+will vary per-provider and if there is a limitation for particular version of particular provider,
+constraining the Airflow version used, it will be included as limitation of dependencies in the provider
+package.
+
+Some of the providers have cross-provider dependencies as well. Those are not required dependencies, they
+might simply enabled certain features (for example transfer operators often create dependency between
+different providers. Again, the general approach her is that the providers are backwards compatible,
+including cross-dependencies. Any kind of breaking changes and requirements on particular versions of other
+provider packages are automatically documented in the release notes of every provider.
+
+.. note::
+    We also provide ``apache-airflow-backport-providers`` packages that can be installed for Airflow 1.10.
+    Those are the same providers as for 2.0 but automatically back-ported to work for Airflow 1.10. Those
+    backport providers are going to be updated and released for 3 months after Apache Airflow 2.0 release.
+
+Provider packages functionality
+-------------------------------
+
+Separate provider packages provide the possibilities that were not available in 1.10:
+
+1. You can upgrade to latest version of particular providers without the need of Apache Airflow core upgrade.
+
+2. You can downgrade to previous version of particular provider in case the new version introduces
+   some problems, without impacting the main Apache Airflow core package.
+
+3. You can release and upgrade/downgrade provider packages incrementally, independent from each other. This
+   means that you can incrementally validate each of the provider package update in your environment,
+   following the usual tests you have in your environment.
+
+
+Q&A for provider packages
+-------------------------
+
+Q. When upgrading to a new Airflow version such as 2.0, but possibly 2.0.1 and beyond, is the best practice
+   to also upgrade provider packages at the same time?
+
+A. It depends on your use case. If you have automated or semi-automated verification of your installation,
+   that you can run a new version of Airflow including all provider packages, then definitely go for it.
+   If you rely more on manual testing, it is advised that you upgrade in stages. Depending on your choice
+   you can either upgrade all used provider packages first, and then upgrade Airflow Core or the other way
+   round. The first approach - when you first upgrade all providers is probably safer, as you can do it
+   incrementally, step-by-step replacing provider by provider in your environment.
+
+
+Q. I have an Airflow version (1.10.12) running and it is stable. However, because of a Cloud provider change,
+   I would like to upgrade the provider package. If I don't need to upgrade the Airflow version anymore,
+   how do I know that this provider version is compatible with my Airflow version?
+
+
+A. Backport Provider Packages (those are needed in 1.10.* Airflow series) are going to be released for
+   3 months after the release. We will stop releasing new updates to the backport providers afterwards.
+   You will be able to continue using the provider packages that you already use and unless you need to
+   get some new release of the provider that is only released for 2.0, there is no need to upgrade
+   Airflow. This might happen if for example the provider is migrated to use newer version of client
+   libraries or when new features/operators/hooks are added to it. Those changes will only be
+   backported to 1.10.* compatible backport providers up to 3 months after releasing Airflow 2.0.
+   Also we expect more providers, changes and fixes added to the existing providers to come after the
+   3 months pass. Eventually you will have to upgrade to Airflow 2.0 if you would like to make use of those.
+   When it comes to compatibility of providers with different Airflow 2 versions, each
+   provider package will keep it's own dependencies, and while we expect those providers to be generally
+   backwards-compatible, particular versions of particular providers might introduce dependencies on
+   specific Airflow versions.
+
+
+Q. I have an older version of my provider package which we have lightly customized and is working
+   fine with my MSSQL installation. I am upgrading my Airflow version. Do I need to upgrade my provider,
+   or can I keep it as it is.
+
+A. It depends on the scope of customization. There is no need to upgrade the provider packages to later
+   versions unless you want to upgrade to Airflow version that introduces backwards incompatible changes.
+   Generally speaking, with Airflow 2 we are following the `Semver <https://semver.org/>`_  approach where
+   we will introduce backwards-incompatible changes in Major releases, so all your modifications (as long
+   as you have not used internal Airflow classes) should work for All Airflow 2.* versions.

Review comment:
       ```suggestion
   A. It depends on the scope of customization. There is no need to upgrade the provider packages to later
      versions unless you want to upgrade to an Airflow version that introduces backwards-incompatible changes.
      Generally speaking, with Airflow 2 we are following the `Semver <https://semver.org/>`_ approach where
      we will introduce backwards-incompatible changes only in major releases, so all your modifications (as long
      as you have not used internal Airflow classes) should work for all Airflow 2.* versions.
   ```

##########
File path: docs/installation.rst
##########
@@ -67,6 +67,20 @@ and python versions in the URL.
     CONSTRAINT_URL="https://raw.githubusercontent.com/apache/airflow/constraints-${AIRFLOW_VERSION}/constraints-${PYTHON_VERSION}.txt"
     pip install "apache-airflow[postgres,google]==${AIRFLOW_VERSION}" --constraint "${CONSTRAINT_URL}"
 
+Most of the extras are linked to a corresponding providers package. For example "amazon" extra
+has a corresponding 'apache-airflow-providers-amazon' providers package to be installed. When you install

Review comment:
       ```suggestion
   has a corresponding ``apache-airflow-providers-amazon`` providers package to be installed. When you install
   ```
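
The link between an extra and its provider package, as described in this hunk, can be sketched in a few lines of Python. This is only an illustration: it assumes the naming pattern implied by the `amazon` example (the prefix `apache-airflow-providers-` plus the extra name, with dots turned into hyphens). The helper `provider_package_for_extra` is hypothetical and not part of Airflow, and since only most extras have a provider, the returned name should be checked against PyPI before use.

```python
PROVIDER_PREFIX = "apache-airflow-providers-"


def provider_package_for_extra(extra: str) -> str:
    """Guess the provider package name that backs a given extra."""
    # Assumption: dots in the extra name become hyphens in the package name.
    return PROVIDER_PREFIX + extra.strip().lower().replace(".", "-")


print(provider_package_for_extra("amazon"))  # apache-airflow-providers-amazon
```

The mapping is deliberately a pure string transformation, so it makes no claim about whether the package actually exists.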







[GitHub] [airflow] potiuk commented on a change in pull request #12203: Adds provider package documentation in installation.rst

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #12203:
URL: https://github.com/apache/airflow/pull/12203#discussion_r519984624



##########
File path: docs/installation.rst
##########
@@ -289,6 +299,100 @@ Here's the list of the subpackages and what they enable:
 | winrm               | ``pip install 'apache-airflow[microsoft.winrm]'``   | WinRM hooks and operators                                            |
 +---------------------+-----------------------------------------------------+----------------------------------------------------------------------+
 
+Provider packages
+.................
+
+
+Provider packages context

Review comment:
       Well, it actually was this way already. 
   
   Unless you installed extras, the number of dependencies for core Airflow was rather small, and this has not changed with "provider packages": we have basically the same set of requirements as before. If you did 'pip install apache-airflow', the same dependencies are pulled in 1.10 and 2.0. And when you do 'pip install apache-airflow[google]', it pulls in the same set of dependencies in 1.10 and 2.0 (but in 2.0 those will be transitive dependencies of the google provider, not the extra's dependencies). The only difference was that if you had not installed the 'google' extra before, the google operators were there in 'airflow.contrib', but they could not be imported because their dependencies were missing.
   
   So technically speaking - 2.0 is not any "lighter" in terms of dependencies. It behaves very much the same as it did before.
   
   Maybe we can add some wording around it but literally, the only difference was that previously the classes were there (unusable) and now they are gone if you don't explicitly add 'extra' or install provider package manually. All dependencies remain the same. So for me this is not a differentiating feature. But maybe it is indeed worth spelling out (but I am not sure how :)) 
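
The point above, that provider classes are now present only when the provider package is installed, can be checked in a given environment. Below is a minimal sketch, assuming Python 3.8+ (for `importlib.metadata`) and assuming provider distributions carry the `apache-airflow-providers-` prefix shown in the diff. `filter_providers` and `installed_providers` are hypothetical helpers, not part of Airflow.

```python
from importlib import metadata

PROVIDER_PREFIX = "apache-airflow-providers-"


def filter_providers(pairs):
    """Keep only (name, version) pairs whose name looks like a provider package."""
    return {
        name.lower(): version
        for name, version in pairs
        if name and name.lower().startswith(PROVIDER_PREFIX)
    }


def installed_providers():
    """Return {package name: version} for provider packages in this environment."""
    pairs = []
    for dist in metadata.distributions():
        meta = dist.metadata
        # Skip distributions with unreadable metadata instead of failing.
        if meta is not None and meta["Name"]:
            pairs.append((meta["Name"], meta["Version"]))
    return filter_providers(pairs)
```

Following the discussion above, an environment with only `apache-airflow` installed would be expected to yield an empty dict from `installed_providers()`, while one installed with the `google` extra on 2.0 should list `apache-airflow-providers-google`.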
   
   







[GitHub] [airflow] petedejoy commented on a change in pull request #12203: Adds provider package documentation in installation.rst

Posted by GitBox <gi...@apache.org>.
petedejoy commented on a change in pull request #12203:
URL: https://github.com/apache/airflow/pull/12203#discussion_r519995890



##########
File path: docs/installation.rst
##########
@@ -289,6 +299,100 @@ Here's the list of the subpackages and what they enable:
 | winrm               | ``pip install 'apache-airflow[microsoft.winrm]'``   | WinRM hooks and operators                                            |
 +---------------------+-----------------------------------------------------+----------------------------------------------------------------------+
 
+Provider packages
+.................
+
+
+Provider packages context

Review comment:
       Ahh, that makes sense. I didn't realize this was how it worked pre-2.0! Thanks for clearing it up, probably ok to leave as is for now then.







[GitHub] [airflow] github-actions[bot] commented on pull request #12203: Adds provider package documentation in installation.rst

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #12203:
URL: https://github.com/apache/airflow/pull/12203#issuecomment-724839252


   The PR is ready to be merged. No tests are needed!





[GitHub] [airflow] potiuk edited a comment on pull request #12203: Adds provider package documentation in installation.rst

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #12203:
URL: https://github.com/apache/airflow/pull/12203#issuecomment-724556653


   @kaxil @ashb  I also updated the installation documentation with information that for b1 the provider packages are not installed automatically. I think we should merge the installation documentation update now regardless of whether we decide to release b2 quickly or not.





[GitHub] [airflow] potiuk commented on pull request #12203: Adds provider package documentation in installation.rst

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #12203:
URL: https://github.com/apache/airflow/pull/12203#issuecomment-724556653


   @kaxil @ashb  I also updated the installation documentation with information that for b1 the provider packages are not installed automatically. I think we should merge the installation documentation update now regardless of whether we decide to release b2 quickly or not.





[GitHub] [airflow] petedejoy commented on a change in pull request #12203: Adds provider package documentation in installation.rst

Posted by GitBox <gi...@apache.org>.
petedejoy commented on a change in pull request #12203:
URL: https://github.com/apache/airflow/pull/12203#discussion_r519922920



##########
File path: docs/installation.rst
##########
@@ -289,6 +299,100 @@ Here's the list of the subpackages and what they enable:
 | winrm               | ``pip install 'apache-airflow[microsoft.winrm]'``   | WinRM hooks and operators                                            |
 +---------------------+-----------------------------------------------------+----------------------------------------------------------------------+
 
+Provider packages
+.................
+
+
+Provider packages context
+-------------------------
+
+Unlike Apache Airflow 1.10, the Airflow 2.0 is delivered in multiple, separate, but connected packages.
+The core of Airflow scheduling system is delivered as ``apache-airflow`` package and there are around
+60 providers packages which can be installed separately as so called "Airflow Provider packages".
+Those provider packages are separated per-provider (for example ``amazon``, ``google``, ``salesforce``
+etc.)  Those packages are available as ``apache-airflow-providers`` packages - separately per ech provider
+(for example there is an ``apache-airflow-providers-amazon`` or ``apache-airflow-providers-google`` package.
+
+You can install those provider packages separately in order to interface with a given provider. For those
+providers that have corresponding extras, the provider packages (latest version from PyPI) are installed
+automatically when Airflow is installed with the extra.
+
+Providers are released and versioned separately from the Airflow releases. We are following the
+`Semver <https://semver.org/>`_ versioning scheme for the packages. Some versions of the provider
+packages might depend on particular versions of Airflow, but the general approach we have is that unless
+there is a good reason, new version of providers should work with recent versions of Airflow 2.x. Details
+will vary per-provider and if there is a limitation for particular version of particular provider,
+constraining the Airflow version used, it will be included as limitation of dependencies in the provider
+package.
+
+Some of the providers have cross-provider dependencies as well. Those are not required dependencies, they
+might simply enabled certain features (for example transfer operators often create dependency between
+different providers. Again, the general approach her is that the providers are backwards compatible,
+including cross-dependencies. Any kind of breaking changes and requirements on particular versions of other
+provider packages are automatically documented in the release notes of every provider.
+
+.. note::
+    We also provide ``apache-airflow-backport-providers`` packages that can be installed for Airflow 1.10.
+    Those are the same providers as for 2.0 but automatically back-ported to work for Airflow 1.10. Those
+    backport providers are going to be updated and released for 3 months after Apache Airflow 2.0 release.
+
+Provider packages functionality
+-------------------------------
+
+Separate provider packages provide the possibilities that were not available in 1.10:
+
+1. You can upgrade to latest version of particular providers without the need of Apache Airflow core upgrade
+
+2. You can downgrade to previous version of particular provider in case the new version introduces
+   some problems, without impacting the main Apache Airflow core package.
+
+3. You can release and upgrade/downgrade provider packages incrementally, independent from each other. This
+   means that you can incrementally validate each of the provider package update in your environment,
+   following the usual tests you have in your environment.
+
+
+Q&A for provider packages
+-------------------------
+
+Q. When upgrading to a new Airflow version such as 2.0, but possibly 2.0.1 and beyond, is the best practice
+   to also upgrade provider packages at the same time?
+
+A. It depends on your use case. If you have automated or semi-automated verification of your installation,
+   that you can run a new version of Airflow including all provider packages, then definitely go for it.
+   If you rely more on manual testing, it is advised that you upgrade in stages. Depending on your choice
+   you can either upgrade all used provider packages first, and then upgrade Airflow Core or the other way
+   round. The first approach - when you first upgrade all providers is probably safer, as you can do it
+   incrementally, step-by-step replacing provider by provider in your environment.
+
+
+Q. I have an Airflow version (1.10.12) running and it is stable. However, because of a Cloud provider change,
+   I would like to upgrade the provider package. If I don't need to upgrade the Airflow version anymore,
+   how do I know that this provider version is compatible with my Airflow version?
+
+
+A. Backport Provider Packages (those are needed in 1.10.* Airflow series) are going to be released for
+   3 months after the release. We will stop releasing new updates to the backport providers afterwards.
+   You will ba able to continue using the provider packages that you already use and unless you need to

Review comment:
       Small typo here: `ba` should be `be` :)

##########
File path: docs/installation.rst
##########
+1. You can upgrade to latest version of particular providers without the need of Apache Airflow core upgrade

Review comment:
       Missing a period here.

##########
File path: docs/installation.rst
##########
+A. Backport Provider Packages (those are needed in 1.10.* Airflow series) are going to be released for
+   3 months after the release. We will stop releasing new updates to the backport providers afterwards.
+   You will ba able to continue using the provider packages that you already use and unless you need to
+   get some new release of the provider that is only released in master, there is no need to upgrade

Review comment:
       In what case would this happen (i.e., a provider being released only in master)? If the BaseOperator changes, or something similar?

##########
File path: docs/installation.rst
##########
@@ -289,6 +299,100 @@ Here's the list of the subpackages and what they enable:
 | winrm               | ``pip install 'apache-airflow[microsoft.winrm]'``   | WinRM hooks and operators                                            |
 +---------------------+-----------------------------------------------------+----------------------------------------------------------------------+
 
+Provider packages
+.................
+
+
+Provider packages context

Review comment:
       My perception could be wrong, but a big benefit I see to this feature is how "dependency-light" it makes the Airflow base. Since all of these providers (which have their own sets of complex dependencies) are no longer included in the base, a user is less likely to run into one-off dependency conflicts when trying to install additional libraries. I wonder if that's worth calling out in the context here? Curious to hear your thoughts @potiuk.
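
       To see how "dependency-light" a given environment is, one could list which provider distributions are installed next to the core package. A minimal sketch, assuming Python 3.8+ (for `importlib.metadata`) and relying only on the `apache-airflow-providers-` name prefix mentioned in the docs above. `filter_providers` and `installed_providers` are hypothetical helper names, not part of Airflow, and the name normalization is simplified:

```python
from importlib import metadata

# Prefix used by the Airflow 2.0 provider distributions, as described in the docs.
PROVIDER_PREFIX = "apache-airflow-providers-"


def filter_providers(names):
    """Return the sorted, de-duplicated provider distribution names found in `names`."""
    # Simplified normalization: lower-case and treat "_" like "-".
    normalized = {name.lower().replace("_", "-") for name in names if name}
    return sorted(n for n in normalized if n.startswith(PROVIDER_PREFIX))


def installed_providers():
    """List provider distributions installed in the current Python environment."""
    names = (dist.metadata["Name"] for dist in metadata.distributions())
    return filter_providers(names)


if __name__ == "__main__":
    for name in installed_providers():
        print(name)
```

       On a core-only installation this would print nothing. Each provider installed separately or through an extra would show up as one line.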







[GitHub] [airflow] potiuk commented on a change in pull request #12203: Adds provider package documentation in installation.rst

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #12203:
URL: https://github.com/apache/airflow/pull/12203#discussion_r519978899



##########
File path: docs/installation.rst
##########
+A. Backport Provider Packages (those are needed in 1.10.* Airflow series) are going to be released for
+   3 months after the release. We will stop releasing new updates to the backport providers afterwards.
+   You will ba able to continue using the provider packages that you already use and unless you need to
+   get some new release of the provider that is only released in master, there is no need to upgrade

Review comment:
       Clarified. Not really. We expect some future changes to providers. A good example is the Google Python client 2.0.*: Google recently released 2.0 Python clients for most of their services, and those are not backwards compatible. Most of the Google providers for 2.0 will eventually migrate to those libraries, and it is rather unlikely that this will happen within the 3 months after the Airflow 2.0 release. Providers based on the new clients will therefore only be available for Airflow 2.0 (once the providers migrate), because we will no longer be releasing backport providers by then.
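
       As an illustration of the split between the two package families, here is a minimal sketch that picks the package name for a provider based on the Airflow major version. The two prefixes come from the documentation under review. The helper name `provider_package_name` is hypothetical, and mapping dotted provider ids such as `microsoft.winrm` to hyphenated package names is an assumption of this sketch:

```python
def provider_package_name(provider: str, airflow_version: str) -> str:
    """Return the PyPI package name to install for `provider` on `airflow_version`."""
    # Only the major version matters here: 1.x uses backport providers, 2.x regular ones.
    major = int(airflow_version.split(".")[0])
    if major >= 2:
        prefix = "apache-airflow-providers-"
    else:
        prefix = "apache-airflow-backport-providers-"
    # Assumption: dotted provider ids map to hyphenated package names.
    return prefix + provider.replace(".", "-")


if __name__ == "__main__":
    print(provider_package_name("google", "1.10.12"))
    print(provider_package_name("google", "2.0.0"))
```

       The sketch only chooses a name. It does not check that a given release of that package actually exists, which is exactly the gap that opens once backport releases stop.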







[GitHub] [airflow] potiuk commented on pull request #12203: Adds provider package documentation in installation.rst

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #12203:
URL: https://github.com/apache/airflow/pull/12203#issuecomment-724747133


   Also added a TOC to the doc.  I think it might be "good enough" to start directing users of b2 to it.





[GitHub] [airflow] github-actions[bot] commented on pull request #12203: Adds provider package documentation in installation.rst

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #12203:
URL: https://github.com/apache/airflow/pull/12203#issuecomment-724103311


   [The Workflow run](https://github.com/apache/airflow/actions/runs/354254768) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.





[GitHub] [airflow] potiuk commented on a change in pull request #12203: Adds provider package documentation in installation.rst

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #12203:
URL: https://github.com/apache/airflow/pull/12203#discussion_r520611671



##########
File path: docs/installation.rst
##########
@@ -289,6 +303,103 @@ Here's the list of the subpackages and what they enable:
 | winrm               | ``pip install 'apache-airflow[microsoft.winrm]'``   | WinRM hooks and operators                                            |
 +---------------------+-----------------------------------------------------+----------------------------------------------------------------------+
 
+Provider packages
+.................
+
+
+Provider packages context
+-------------------------
+
+Unlike Apache Airflow 1.10, the Airflow 2.0 is delivered in multiple, separate, but connected packages.
+The core of Airflow scheduling system is delivered as ``apache-airflow`` package and there are around
+60 providers packages which can be installed separately as so called "Airflow Provider packages".
+Those provider packages are separated per-provider (for example ``amazon``, ``google``, ``salesforce``
+etc.)  Those packages are available as ``apache-airflow-providers`` packages - separately per each provider
+(for example there is an ``apache-airflow-providers-amazon`` or ``apache-airflow-providers-google`` package.
+
+You can install those provider packages separately in order to interface with a given provider. For those
+providers that have corresponding extras, the provider packages (latest version from PyPI) are installed
+automatically when Airflow is installed with the extra.
+
+Providers are released and versioned separately from the Airflow releases. We are following the
+`Semver <https://semver.org/>`_ versioning scheme for the packages. Some versions of the provider
+packages might depend on particular versions of Airflow, but the general approach we have is that unless
+there is a good reason, new version of providers should work with recent versions of Airflow 2.x. Details
+will vary per-provider and if there is a limitation for particular version of particular provider,
+constraining the Airflow version used, it will be included as limitation of dependencies in the provider
+package.
+
+Some of the providers have cross-provider dependencies as well. Those are not required dependencies, they
+might simply enabled certain features (for example transfer operators often create dependency between
+different providers. Again, the general approach her is that the providers are backwards compatible,
+including cross-dependencies. Any kind of breaking changes and requirements on particular versions of other
+provider packages are automatically documented in the release notes of every provider.
+
+.. note::
+    We also provide ``apache-airflow-backport-providers`` packages that can be installed for Airflow 1.10.
+    Those are the same providers as for 2.0 but automatically back-ported to work for Airflow 1.10. Those
+    backport providers are going to be updated and released for 3 months after Apache Airflow 2.0 release.
+
+Provider packages functionality
+-------------------------------
+
+Separate provider packages provide the possibilities that were not available in 1.10:
+
+1. You can upgrade to latest version of particular providers without the need of Apache Airflow core upgrade.
+
+2. You can downgrade to previous version of particular provider in case the new version introduces
+   some problems, without impacting the main Apache Airflow core package.
+
+3. You can release and upgrade/downgrade provider packages incrementally, independent from each other. This
+   means that you can incrementally validate each of the provider package update in your environment,
+   following the usual tests you have in your environment.
+
+
+Q&A for provider packages
+-------------------------
+
+Q. When upgrading to a new Airflow version such as 2.0, but possibly 2.0.1 and beyond, is the best practice
+   to also upgrade provider packages at the same time?

Review comment:
       I think our questions are a bit too long to make them into headers. But I divided the Q&A into sections (for now one Q/A per section) and bolded the questions there. Later we might add some more questions in each section.







[GitHub] [airflow] potiuk commented on pull request #12203: Adds provider package documentation in installation.rst

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #12203:
URL: https://github.com/apache/airflow/pull/12203#issuecomment-724157408


   Pushed fixup :)





[GitHub] [airflow] potiuk merged pull request #12203: Adds provider package documentation in installation.rst

Posted by GitBox <gi...@apache.org>.
potiuk merged pull request #12203:
URL: https://github.com/apache/airflow/pull/12203


   

