You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/12/08 10:42:37 UTC

[GitHub] [airflow] ashb opened a new pull request #12916: The 4 providers (http, ftp, sqlite, imap) are popular and they do not require any additionl dependencies so we decided to include them by default in Airflow 2.0

ashb opened a new pull request #12916:
URL: https://github.com/apache/airflow/pull/12916


   @potiuk Please take a look -- this is a simple subclassing of Distribution that means we can keep things in setup.cfg (which I get the impression the pip team want to encourage -- as it is statically parsable.)
   
   Closes #12744, #12903
   
   <!--
   Thank you for contributing! Please make sure that your code changes
   are covered with tests. And in case of new features or big changes
   remember to adjust the documentation.
   
   Feel free to ping committers for the review!
   
   In case of existing issue, reference it using one of the following:
   
   closes: #ISSUE
   related: #ISSUE
   
   How to write a good git commit message:
   http://chris.beams.io/posts/git-commit/
   -->
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #12916: Adds predefined providers to install_requires.

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #12916:
URL: https://github.com/apache/airflow/pull/12916#issuecomment-740583712


   > ![Screenshot from 2020-12-08 13-08-19](https://user-images.githubusercontent.com/595491/101482206-a4d00900-3956-11eb-914b-01c497a5bb78.png)
   > 
   > @RosterIn - I hope that will make it clear :)
   
   And the wording in "naming convention" is I think rather clear:
   
   > It is often debatable where to put transfer operators but we agreed to the following criteria:
   > * We use "maintainability" of the operators as the main criteria - so the transfer operator should be kept at the provider which has highest "interest" in the transfer operator
   > * For Cloud Providers or Service providers that usually means that the transfer operators should land at the "target" side of the transfer


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #12916: Adds predefined providers to install_requires.

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #12916:
URL: https://github.com/apache/airflow/pull/12916#issuecomment-740581957


   ![Screenshot from 2020-12-08 13-08-19](https://user-images.githubusercontent.com/595491/101482206-a4d00900-3956-11eb-914b-01c497a5bb78.png)
   
   @RosterIn - I hope that will make it clear :)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] RosterIn commented on pull request #12916: The 4 providers (http, ftp, sqlite, imap) are popular and they do not require any additionl dependencies so we decided to include them by default in Airflow 2.0

Posted by GitBox <gi...@apache.org>.
RosterIn commented on pull request #12916:
URL: https://github.com/apache/airflow/pull/12916#issuecomment-740565094


   Question related to [AIP-21](https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-21%3A+Changes+in+import+paths#AIP21:Changesinimportpaths-Case#5*Operator):
   "In case of transfer operators where two providers are involved, the transfer operators will be moved to "source" of the transfer. "
   
   What will happen if `FtpToXOperator` will be added? This will add `X` as dependency to ftp provider.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk merged pull request #12916: Adds predefined providers to install_requires.

Posted by GitBox <gi...@apache.org>.
potiuk merged pull request #12916:
URL: https://github.com/apache/airflow/pull/12916


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #12916: Adds predefined providers to install_requires.

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #12916:
URL: https://github.com/apache/airflow/pull/12916#issuecomment-740578946


   > What will happen if `FtpToXOperator` will be added? This will add `X` as dependency to ftp provider
   
   This AIP has been updated several times, and while we keep all the historical votings, there are also some subsequent changes. So AIP is mostly kept as historical records and it contains several updates (there were several rounds of additional changes and votes after the original voting passed).  
   
   If you want to see final decisions made and current status of naming conventions just look here (I will update that information in the AIP-21 to direct people there): 
   
   https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#naming-conventions-for-provider-packages
   
   First of all FTPToX will by default land in "X" not in FTP (that has reversed since the first AIP-21 voting). The main criteria we have when we decide is who is most interested (stakeholder) in maintaining specific provider (and assumption is that the "target" side cares more and has more stakes in getting the data). 
   
   So FTPtoS3 will go to Amazon, FTPToGCS will go to Google. 
   
   In case of FTP/HTTP/IMAP/SQLITE - those are "generic" providers that have no clear stakeholders behind, so even GCSToFTP will go to Google and S3ToFTP will go to Amazon, even if they re the "source", this is a case-by-case decision but in case of very generic operators, databases which are not run "as a service" only, this is rather clear which direction it should go. 
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #12916: Adds predefined providers to install_requires.

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #12916:
URL: https://github.com/apache/airflow/pull/12916#issuecomment-740583712


   
   And the wording in "naming convention" is I think rather clear:
   
   > It is often debatable where to put transfer operators but we agreed to the following criteria:
   > * We use "maintainability" of the operators as the main criteria - so the transfer operator should be kept at the provider which has highest "interest" in the transfer operator
   > * For Cloud Providers or Service providers that usually means that the transfer operators should land at the "target" side of the transfer


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #12916: Adds predefined providers to install_requires.

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #12916:
URL: https://github.com/apache/airflow/pull/12916#issuecomment-740583712


   > ![Screenshot from 2020-12-08 13-08-19](https://user-images.githubusercontent.com/595491/101482206-a4d00900-3956-11eb-914b-01c497a5bb78.png)
   > 
   > @RosterIn - I hope that will make it clear :)
   
   And the wording in "naming convention" is I think rather clear:
   
   It is often debatable where to put transfer operators but we agreed to the following criteria:
   * We use "maintainability" of the operators as the main criteria - so the transfer operator should be kept at the provider which has highest "interest" in the transfer operator
   * For Cloud Providers or Service providers that usually means that the transfer operators should land at the "target" side of the transfer


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #12916: Adds predefined providers to install_requires.

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #12916:
URL: https://github.com/apache/airflow/pull/12916#discussion_r538321436



##########
File path: setup.py
##########
@@ -874,6 +874,22 @@ def is_package_excluded(package: str, exclusion_list: List[str]):
 )
 
 
+class AirflowDistribtuion(Distribution):
+    """setuptools.Distribution subclass with Airflow specific behaviour"""
+
+    # https://github.com/PyCQA/pylint/issues/3737
+    def parse_config_files(self, *args, **kwargs):  # pylint: disable=signature-differs
+        """
+        Ensure that when we have been asked to install providers from sources
+        that we don't *also* try ot install those providers from PyPI

Review comment:
       ```suggestion
           that we don't *also* try to install those providers from PyPI
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org