You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/04/18 01:23:38 UTC

[GitHub] [airflow] mik-laj opened a new pull request #8432: Provide GCP credentials in Bash/Python operators

mik-laj opened a new pull request #8432: Provide GCP credentials in Bash/Python operators
URL: https://github.com/apache/airflow/pull/8432
 
 
   ---
   Make sure to mark the boxes below before creating PR: [x]
   
   - [X] Description above provides context of the change
   - [X] Unit tests coverage for changes (not needed for documentation changes)
   - [X] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
   - [X] Relevant documentation is updated including usage instructions.
   - [X] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] potiuk commented on issue #8432: Provide GCP credentials in Bash/Python operators

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #8432: Provide GCP credentials in Bash/Python operators
URL: https://github.com/apache/airflow/pull/8432#issuecomment-615914795
 
 
   > I will present something generic for moments, which will not require the transfer of a complex object. It's as easy to use as it is now, but it will be more generic. It will be something similar to the code below.
   > 
   
   That looks much better indeed. Let's see how this will look like eventually :) 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] ashb commented on issue #8432: Provide GCP credentials in Bash/Python operators

Posted by GitBox <gi...@apache.org>.
ashb commented on issue #8432: Provide GCP credentials in Bash/Python operators
URL: https://github.com/apache/airflow/pull/8432#issuecomment-615732640
 
 
   Yeah I agree this feels very leaky.
   
   Being able to pass _any_ Airflow connection securely in to Bash (or SSH) Op feels like a useful ability. GCP, or any special case less so.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] codecov-io edited a comment on issue #8432: Provide GCP credentials in Bash/Python operators

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on issue #8432: Provide GCP credentials in Bash/Python operators
URL: https://github.com/apache/airflow/pull/8432#issuecomment-615537382
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/8432?src=pr&el=h1) Report
   > Merging [#8432](https://codecov.io/gh/apache/airflow/pull/8432?src=pr&el=desc) into [master](https://codecov.io/gh/apache/airflow/commit/96df427e07601e331afd6990ce7613b2026acfe0&el=desc) will **decrease** coverage by `0.00%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/airflow/pull/8432/graphs/tree.svg?width=650&height=150&src=pr&token=WdLKlKHOAU)](https://codecov.io/gh/apache/airflow/pull/8432?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff            @@
   ##           master   #8432      +/-   ##
   =========================================
   - Coverage    6.23%   6.22%   -0.01%     
   =========================================
     Files         946     950       +4     
     Lines       45661   45723      +62     
   =========================================
     Hits         2846    2846              
   - Misses      42815   42877      +62     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/airflow/pull/8432?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [...rflow/example\_dags/example\_google\_bash\_operator.py](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV9nb29nbGVfYmFzaF9vcGVyYXRvci5weQ==) | `0.00% <0.00%> (ø)` | |
   | [...dags/example\_google\_bash\_operator\_custom\_script.py](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV9nb29nbGVfYmFzaF9vcGVyYXRvcl9jdXN0b21fc2NyaXB0LnB5) | `0.00% <0.00%> (ø)` | |
   | [...low/example\_dags/example\_google\_python\_operator.py](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV9nb29nbGVfcHl0aG9uX29wZXJhdG9yLnB5) | `0.00% <0.00%> (ø)` | |
   | [airflow/operators/bash.py](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvYmFzaC5weQ==) | `0.00% <0.00%> (ø)` | |
   | [airflow/operators/python.py](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvcHl0aG9uLnB5) | `0.00% <0.00%> (ø)` | |
   | [airflow/utils/documentation.py](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree#diff-YWlyZmxvdy91dGlscy9kb2N1bWVudGF0aW9uLnB5) | `0.00% <0.00%> (ø)` | |
   | [airflow/www/app.py](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3cvYXBwLnB5) | `0.00% <0.00%> (ø)` | |
   | ... and [1 more](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/airflow/pull/8432?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/airflow/pull/8432?src=pr&el=footer). Last update [96df427...235382f](https://codecov.io/gh/apache/airflow/pull/8432?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] kaxil commented on issue #8432: Provide GCP credentials in Bash/Python operators

Posted by GitBox <gi...@apache.org>.
kaxil commented on issue #8432: Provide GCP credentials in Bash/Python operators
URL: https://github.com/apache/airflow/pull/8432#issuecomment-615865246
 
 
   I agree with Jarek here, any cloud specific code in Core Operators would hurt my eyes.
   
   Each cloud provider also provides a way to authenticate using an Environment Variable too. In case of GCP it is GOOGLE_APPLICATION_CREDENTIALS which can be used for Bash and Python Operators.
   
   And if the Virtual Machine is on that Cloud Provider itself you don't need to authenticate as you can VM's credentials tooo

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] mik-laj commented on issue #8432: Provide GCP credentials in Bash/Python operators

Posted by GitBox <gi...@apache.org>.
mik-laj commented on issue #8432: Provide GCP credentials in Bash/Python operators
URL: https://github.com/apache/airflow/pull/8432#issuecomment-615910786
 
 
   I will present something generic for moments, which will not require the transfer of a complex object. It's as easy to use as it is now, but it will be more generic. It will be something similar to the code below.
   ```
                       exit_stack.enter_context(  # pylint: disable=no-member
                           BaseHook.get_connection(conn).get_hook(
                               gcp_conn_id=self.gcp_conn_id, delegate_to=self.gcp_gcp_delegate_to
                           ).provide_authorization()
                       )
   ```
   New services will continue to be added by composition rather than inheritance.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] potiuk commented on issue #8432: Provide GCP credentials in Bash/Python operators

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #8432: Provide GCP credentials in Bash/Python operators
URL: https://github.com/apache/airflow/pull/8432#issuecomment-615728611
 
 
   I disagree. It's not a matter about inheritance vs. composition. I think we can find a good solution for that - but It's the matter of bringing GCP specific code to shared Bash Operator. Not everyone uses GCP and having code that is GCP-specific in general-purpose operator hurts my eyes.
   
   BashOperator should know nothing about GCP.  Maybe indeed this should be done via plugins or similar solution.
   
   I wonder what others thing about it @turbaszek @kaxil @ashb ? Should BashOperator contain GCP-specific code? WDYT?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] mik-laj edited a comment on issue #8432: Provide GCP credentials in Bash/Python operators

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on issue #8432: Provide GCP credentials in Bash/Python operators
URL: https://github.com/apache/airflow/pull/8432#issuecomment-615560449
 
 
   @potiuk This contradicts the whole idea and the need for this operator. BashOperator and PythonOperaator are very useful because it is universal. Bash and Python are also built by compositions. New applications are installed on the system and can be used by any tool. If we inherit and make customization GCP-specific, we will limit its functionality. It will no longer be a universal operator. You will only be able to use it with one provider.  I think this is a similar problem to ``Connection.get_hook``.
   https://github.com/apache/airflow/blob/master/airflow/models/connection.py#L301
   This method is useful because it is universal and can be used regardless of the provider.
   In the future, if we need to separate the core and providers, we can extend this class with a plugin. A plugin that will add new parameters to the class. Like the get_hook method, it should use the plugin mechanism.
   
   I hope that in the future new parameters will be added for other cloud providers, e.g. AWS.
   ```python
       cross_platform_task = BashOperator(
           task_id='gcloud',
           bash_command=(
               'gsutil cp gs//bucket/a.txt a.txt && aws s3 cp test.txt s3://mybucket/test2.txt'
           ),
           gcp_conn_id=GCP_PROJECT_ID,
           aws_conn_id=AWS_PROJECT_ID,
       )
   ```
   Then it will still be a universal operator and we will not build a vendor-lock for one providers.
   
   From an architectural point of view. Here the use of inheritance will be bad, but we should composition. Inheritance will limit these operators too much.  
   I invite you to read the article.https://en.wikipedia.org/wiki/Composition_over_inheritance
   
   I will only cite one fragment.
   >Note that multiple inheritance is dangerous if not implemented carefully, as it can lead to the diamond problem. One solution to avoid this is to create classes such as **VisibleAndSolid**, **VisibleAndMovable**, **VisibleAndSolidAndMovable**, etc. for every needed combination, though this leads to a large amount of repetitive code. 
   
   If we replace some words, we have our problem.
   
   >Note that multiple inheritance is dangerous if not implemented carefully, as it can lead to the diamond problem. One solution to avoid this is to create classes such as **GoogleAndAws**, **GoogleAndAzure**, **AwsAndAzureAndGoogle**, etc. for every needed combination, though this leads to a large amount of repetitive code.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] turbaszek commented on issue #8432: Provide GCP credentials in Bash/Python operators

Posted by GitBox <gi...@apache.org>.
turbaszek commented on issue #8432: Provide GCP credentials in Bash/Python operators
URL: https://github.com/apache/airflow/pull/8432#issuecomment-615884076
 
 
   I agree that having provider-specific code in Bash / Python ops doesn't sound good. However, I think that it would be nice to help users somehow to authorize in those ops.
   
   Other idea I have is to pass an authorization context manager to operators:
   ```python
   class BashOperator:
       def __init__(..., authctx=None):
           self.authctx = authctx
   
       def pre_execute(self):
           self.authctx.enter()
   
        def post_execute(self):
           self.authctx.close()
   ```
   and then users can do something like this:
   ```
   bop = BashOperator(task_id="id", ..., authctx=gcp_auth(KEY_NAME))
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] mik-laj edited a comment on issue #8432: Provide GCP credentials in Bash/Python operators

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on issue #8432: Provide GCP credentials in Bash/Python operators
URL: https://github.com/apache/airflow/pull/8432#issuecomment-615567459
 
 
   As I started thinking about it for a long time, we can create a  `get_subprocess_context_manager` method in ``hook`` and also use the ``get_hook`` method here. I'm afraid it might be overenginnering.  However, if you agree with me that we should use the composition, I can try to do it.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] codecov-io edited a comment on issue #8432: Provide GCP credentials in Bash/Python operators

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on issue #8432: Provide GCP credentials in Bash/Python operators
URL: https://github.com/apache/airflow/pull/8432#issuecomment-615537382
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/8432?src=pr&el=h1) Report
   > Merging [#8432](https://codecov.io/gh/apache/airflow/pull/8432?src=pr&el=desc) into [master](https://codecov.io/gh/apache/airflow/commit/96df427e07601e331afd6990ce7613b2026acfe0&el=desc) will **decrease** coverage by `0.00%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/airflow/pull/8432/graphs/tree.svg?width=650&height=150&src=pr&token=WdLKlKHOAU)](https://codecov.io/gh/apache/airflow/pull/8432?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff            @@
   ##           master   #8432      +/-   ##
   =========================================
   - Coverage    6.23%   6.22%   -0.01%     
   =========================================
     Files         946     950       +4     
     Lines       45661   45723      +62     
   =========================================
     Hits         2846    2846              
   - Misses      42815   42877      +62     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/airflow/pull/8432?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [...rflow/example\_dags/example\_google\_bash\_operator.py](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV9nb29nbGVfYmFzaF9vcGVyYXRvci5weQ==) | `0.00% <0.00%> (ø)` | |
   | [...dags/example\_google\_bash\_operator\_custom\_script.py](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV9nb29nbGVfYmFzaF9vcGVyYXRvcl9jdXN0b21fc2NyaXB0LnB5) | `0.00% <0.00%> (ø)` | |
   | [...low/example\_dags/example\_google\_python\_operator.py](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV9nb29nbGVfcHl0aG9uX29wZXJhdG9yLnB5) | `0.00% <0.00%> (ø)` | |
   | [airflow/operators/bash.py](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvYmFzaC5weQ==) | `0.00% <0.00%> (ø)` | |
   | [airflow/operators/python.py](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvcHl0aG9uLnB5) | `0.00% <0.00%> (ø)` | |
   | [airflow/utils/documentation.py](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree#diff-YWlyZmxvdy91dGlscy9kb2N1bWVudGF0aW9uLnB5) | `0.00% <0.00%> (ø)` | |
   | [airflow/www/app.py](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3cvYXBwLnB5) | `0.00% <0.00%> (ø)` | |
   | ... and [1 more](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/airflow/pull/8432?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/airflow/pull/8432?src=pr&el=footer). Last update [96df427...235382f](https://codecov.io/gh/apache/airflow/pull/8432?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] codecov-io edited a comment on issue #8432: Provide GCP credentials in Bash/Python operators

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on issue #8432: Provide GCP credentials in Bash/Python operators
URL: https://github.com/apache/airflow/pull/8432#issuecomment-615537382
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/8432?src=pr&el=h1) Report
   > Merging [#8432](https://codecov.io/gh/apache/airflow/pull/8432?src=pr&el=desc) into [master](https://codecov.io/gh/apache/airflow/commit/96df427e07601e331afd6990ce7613b2026acfe0&el=desc) will **decrease** coverage by `0.00%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/airflow/pull/8432/graphs/tree.svg?width=650&height=150&src=pr&token=WdLKlKHOAU)](https://codecov.io/gh/apache/airflow/pull/8432?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff            @@
   ##           master   #8432      +/-   ##
   =========================================
   - Coverage    6.23%   6.22%   -0.01%     
   =========================================
     Files         946     950       +4     
     Lines       45661   45723      +62     
   =========================================
     Hits         2846    2846              
   - Misses      42815   42877      +62     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/airflow/pull/8432?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [...rflow/example\_dags/example\_google\_bash\_operator.py](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV9nb29nbGVfYmFzaF9vcGVyYXRvci5weQ==) | `0.00% <0.00%> (ø)` | |
   | [...dags/example\_google\_bash\_operator\_custom\_script.py](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV9nb29nbGVfYmFzaF9vcGVyYXRvcl9jdXN0b21fc2NyaXB0LnB5) | `0.00% <0.00%> (ø)` | |
   | [...low/example\_dags/example\_google\_python\_operator.py](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV9nb29nbGVfcHl0aG9uX29wZXJhdG9yLnB5) | `0.00% <0.00%> (ø)` | |
   | [airflow/operators/bash.py](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvYmFzaC5weQ==) | `0.00% <0.00%> (ø)` | |
   | [airflow/operators/python.py](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvcHl0aG9uLnB5) | `0.00% <0.00%> (ø)` | |
   | [airflow/utils/documentation.py](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree#diff-YWlyZmxvdy91dGlscy9kb2N1bWVudGF0aW9uLnB5) | `0.00% <0.00%> (ø)` | |
   | [airflow/www/app.py](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3cvYXBwLnB5) | `0.00% <0.00%> (ø)` | |
   | ... and [1 more](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/airflow/pull/8432?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/airflow/pull/8432?src=pr&el=footer). Last update [96df427...235382f](https://codecov.io/gh/apache/airflow/pull/8432?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] kaxil commented on issue #8432: Provide GCP credentials in Bash/Python operators

Posted by GitBox <gi...@apache.org>.
kaxil commented on issue #8432: Provide GCP credentials in Bash/Python operators
URL: https://github.com/apache/airflow/pull/8432#issuecomment-615884586
 
 
   Yeah I am fine if we can create something "generic"

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] mik-laj edited a comment on issue #8432: Provide GCP credentials in Bash/Python operators

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on issue #8432: Provide GCP credentials in Bash/Python operators
URL: https://github.com/apache/airflow/pull/8432#issuecomment-615560449
 
 
   @potiuk This contradicts the whole idea and the need for this operator. BashOperator and PythonOperaator are very useful because it is universal. Bash and Python are also built by compositions. New applications are installed on the system and can be used by any tool. If we inherit and make customization GCP-specific, we will limit its functionality. It will no longer be a universal operator. You will only be able to use it with one provider.  I think this is a similar problem to ``Connection.get_hook``.
   https://github.com/apache/airflow/blob/master/airflow/models/connection.py#L301
   This method is useful because it is universal and can be used regardless of the provider.
   In the future, if we need to separate the core and providers, we can extend this class with a plugin. A plugin that will add new parameters to the class. Like the get_hook method, it should use the plugin mechanism.
   
   I hope that in the future new parameters will be added for other cloud providers, e.g. AWS.
   ```python
       cross_platform_task = BashOperator(
           task_id='gcloud',
           bash_command=(
               'gsutil cp gs//bucket/a.txt a.txt && aws s3 cp test.txt s3://mybucket/test2.txt'
           ),
           gcp_conn_id=GCP_PROJECT_ID,
           aws_conn_id=AWS_PROJECT_ID,
       )
   ```
   Then it will still be a universal operator and we will not build a vendor-lock for one providers.
   
   From an architectural point of view. Here the use of inheritance will be bad, but we should composition. Inheritance will limit these operators too much.  
   I invite you to read the article.https://en.wikipedia.org/wiki/Composition_over_inheritance
   
   I will only cite one fragment.
   >Note that multiple inheritance is dangerous if not implemented carefully, as it can lead to the diamond problem. One solution to avoid this is to create classes such as **VisibleAndSolid**, **VisibleAndMovable**, **VisibleAndSolidAndMovable**, etc. for every needed combination, though this leads to a large amount of repetitive code. 
   
   If we replace some words, we have our problem.
   
   >Note that multiple inheritance is dangerous if not implemented carefully, as it can lead to the diamond problem. One solution to avoid this is to create classes such as **GoogleAndAws**, **GoogleAndAzure**, **AwsAndAzureAndGoogle**, etc. for every needed combination, though this leads to a large amount of repetitive code. 
   
   
   However, this is one of the parts that should be resolved by [AIP-8](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=100827303). We do not have enough use cases yet. It will be very difficult to build abstractions if we only support GCP.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] mik-laj edited a comment on issue #8432: Provide GCP credentials in Bash/Python operators

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on issue #8432: Provide GCP credentials in Bash/Python operators
URL: https://github.com/apache/airflow/pull/8432#issuecomment-615567459
 
 
   As I started thinking about it for a long time, we can create a  `get_subprocess_context`` method and also use the ``get_hook`` method here. I'm afraid it might be overenginnering.  However, if you agree with me that we should use the composition, I can try to do it.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] mik-laj edited a comment on issue #8432: Provide GCP credentials in Bash/Python operators

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on issue #8432: Provide GCP credentials in Bash/Python operators
URL: https://github.com/apache/airflow/pull/8432#issuecomment-615910786
 
 
   I will present something generic for moments, which will not require the transfer of a complex object. It's as easy to use as it is now, but it will be more generic. It will be something similar to the code below.
   ```
                       exit_stack.enter_context(  # pylint: disable=no-member
                           BaseHook.get_connection(conn).get_hook(
                               conn_id=self.conn_id,
                           ).provide_authorization()
                       )
   ```
   New services will continue to be added by composition rather than inheritance.
   
   Thank you very much for the comments. This shows that this feature can be useful.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] mik-laj edited a comment on issue #8432: Provide GCP credentials in Bash/Python operators

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on issue #8432: Provide GCP credentials in Bash/Python operators
URL: https://github.com/apache/airflow/pull/8432#issuecomment-615560449
 
 
   @potiuk This contradicts the whole idea and the need for this operator. BashOperator and PythonOperaator are very useful because it is universal. Bash and Python are also built by compositions. New applications are installed on the system and can be used by any tool. If we inherit and make customization GCP-specific, we will limit its functionality. It will no longer be a universal operator. You will only be able to use it with one provider.  I think this is a similar problem to ``Connection.get_hook``.
   https://github.com/apache/airflow/blob/master/airflow/models/connection.py#L301
   This method is useful because it is universal and can be used regardless of the provider.
   In the future, if we need to separate the core and providers, we can extend this class with a plugin. A plugin that will add new parameters to the class. Like the get_hook method, it should use the plugin mechanism.
   
   I hope that in the future new parameters will be added for other cloud providers, e.g. AWS.
   ```python
       cross_platform_task = BashOperator(
           task_id='gcloud',
           bash_command=(
               'gsutil cp gs//bucket/a.txt a.txt && aws s3 cp test.txt s3://mybucket/test2.txt'
           ),
           gcp_conn_id=GCP_PROJECT_ID,
           aws_conn_id=AWS_PROJECT_ID,
       )
   ```
   Then it will still be a universal operator and we will not build a vendor-lock for one providers.
   
   From an architectural point of view. Here the use of inheritance will be bad, but we should composition. Inheritance will limit these operators too much.  
   I invite you to read the article.https://en.wikipedia.org/wiki/Composition_over_inheritance
   
   I will only cite one fragment.
   >Note that multiple inheritance is dangerous if not implemented carefully, as it can lead to the diamond problem. One solution to avoid this is to create classes such as **VisibleAndSolid**, **VisibleAndMovable**, **VisibleAndSolidAndMovable**, etc. for every needed combination, though this leads to a large amount of repetitive code. 
   
   If we replace some words, we have our problem.
   
   >Note that multiple inheritance is dangerous if not implemented carefully, as it can lead to the diamond problem. One solution to avoid this is to create classes such as **GoogleAndAws**, **GoogleAndAzure**, **AwsAndAzureAndGoogle**, etc. for every needed combination, though this leads to a large amount of repetitive code. 
   
   
   However, this is one of the parts that should be resolved by [AIP-8](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=100827303).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] mik-laj edited a comment on issue #8432: Provide GCP credentials in Bash/Python operators

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on issue #8432: Provide GCP credentials in Bash/Python operators
URL: https://github.com/apache/airflow/pull/8432#issuecomment-615910786
 
 
   I will present something generic for moments, which will not require the transfer of a complex object. It's as easy to use as it is now, but it will be more generic. It will be something similar to the code below.
   ```
                       exit_stack.enter_context(  # pylint: disable=no-member
                           BaseHook.get_connection(conn).get_hook(
                               gcp_conn_id=self.gcp_conn_id, delegate_to=self.gcp_gcp_delegate_to
                           ).provide_authorization()
                       )
   ```
   New services will continue to be added by composition rather than inheritance.
   
   Thank you very much for the comments. This shows that this feature can be useful.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] mik-laj edited a comment on issue #8432: Provide GCP credentials in Bash/Python operators

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on issue #8432: Provide GCP credentials in Bash/Python operators
URL: https://github.com/apache/airflow/pull/8432#issuecomment-615567459
 
 
   As I started thinking about it for a long time, we can create a  `get_subprocess_context` method and also use the ``get_hook`` method here. I'm afraid it might be overenginnering.  However, if you agree with me that we should use the composition, I can try to do it.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] codecov-io commented on issue #8432: Provide GCP credentials in Bash/Python operators

Posted by GitBox <gi...@apache.org>.
codecov-io commented on issue #8432: Provide GCP credentials in Bash/Python operators
URL: https://github.com/apache/airflow/pull/8432#issuecomment-615537382
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/8432?src=pr&el=h1) Report
   > Merging [#8432](https://codecov.io/gh/apache/airflow/pull/8432?src=pr&el=desc) into [master](https://codecov.io/gh/apache/airflow/commit/96df427e07601e331afd6990ce7613b2026acfe0&el=desc) will **decrease** coverage by `0.00%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/airflow/pull/8432/graphs/tree.svg?width=650&height=150&src=pr&token=WdLKlKHOAU)](https://codecov.io/gh/apache/airflow/pull/8432?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff            @@
   ##           master   #8432      +/-   ##
   =========================================
   - Coverage    6.23%   6.22%   -0.01%     
   =========================================
     Files         946     950       +4     
     Lines       45661   45723      +62     
   =========================================
     Hits         2846    2846              
   - Misses      42815   42877      +62     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/airflow/pull/8432?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [...rflow/example\_dags/example\_google\_bash\_operator.py](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV9nb29nbGVfYmFzaF9vcGVyYXRvci5weQ==) | `0.00% <0.00%> (ø)` | |
   | [...dags/example\_google\_bash\_operator\_custom\_script.py](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV9nb29nbGVfYmFzaF9vcGVyYXRvcl9jdXN0b21fc2NyaXB0LnB5) | `0.00% <0.00%> (ø)` | |
   | [...low/example\_dags/example\_google\_python\_operator.py](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV9nb29nbGVfcHl0aG9uX29wZXJhdG9yLnB5) | `0.00% <0.00%> (ø)` | |
   | [airflow/operators/bash.py](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvYmFzaC5weQ==) | `0.00% <0.00%> (ø)` | |
   | [airflow/operators/python.py](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvcHl0aG9uLnB5) | `0.00% <0.00%> (ø)` | |
   | [airflow/utils/documentation.py](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree#diff-YWlyZmxvdy91dGlscy9kb2N1bWVudGF0aW9uLnB5) | `0.00% <0.00%> (ø)` | |
   | [airflow/www/app.py](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3cvYXBwLnB5) | `0.00% <0.00%> (ø)` | |
   | ... and [1 more](https://codecov.io/gh/apache/airflow/pull/8432/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/airflow/pull/8432?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/airflow/pull/8432?src=pr&el=footer). Last update [96df427...235382f](https://codecov.io/gh/apache/airflow/pull/8432?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] mik-laj commented on issue #8432: Provide GCP credentials in Bash/Python operators

Posted by GitBox <gi...@apache.org>.
mik-laj commented on issue #8432: Provide GCP credentials in Bash/Python operators
URL: https://github.com/apache/airflow/pull/8432#issuecomment-615567459
 
 
   As I started thinking about it for a long time, we can create a  `get_subprocess_context`` method and also use the ``get_hook`` method here. I'm afraid it might be overenginnering.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] mik-laj edited a comment on issue #8432: Provide GCP credentials in Bash/Python operators

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on issue #8432: Provide GCP credentials in Bash/Python operators
URL: https://github.com/apache/airflow/pull/8432#issuecomment-615560449
 
 
   @potiuk This contradicts the whole idea and the need for this operator. BashOperator and PythonOperaator are very useful because it is universal. Bash and Python are also built by compositions. New applications are installed on the system and can be used by any tool. If we inherit and make customization GCP-specific, we will limit its functionality. It will no longer be a universal operator. You will only be able to use it with one provider.  I think this is a similar problem to ``Connection.get_hook``.
   https://github.com/apache/airflow/blob/master/airflow/models/connection.py#L301
   This method is useful because it is universal and can be used regardless of the provider.
   In the future, if we need to separate the core and providers, we can extend this class with a plugin. A plugin that will add new parameters to the class. Like the get_hook method, it should use the plugin mechanism.
   
   From an architectural point of view. Here the use of inheritance will be bad, but we should composition. Inheritance will limit these operators too much.  More information: https://en.wikipedia.org/wiki/Composition_over_inheritance
   
   I hope that in the future new parameters will be added for other cloud providers, e.g. AWS.
   ```python
       cross_platform_task = BashOperator(
           task_id='gcloud',
           bash_command=(
               'gsutil cp gs//bucket/a.txt a.txt && aws s3 cp test.txt s3://mybucket/test2.txt'
           ),
           gcp_conn_id=GCP_PROJECT_ID,
           aws_conn_id=AWS_PROJECT_ID,
       )
   ```
   Then it will still be a universal operator and we will not build a vendor-lock for one providers.
   
   However, this is one of the parts that should be resolved by [AIP-8](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=100827303).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] potiuk commented on issue #8432: Provide GCP credentials in Bash/Python operators

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #8432: Provide GCP credentials in Bash/Python operators
URL: https://github.com/apache/airflow/pull/8432#issuecomment-615544056
 
 
   I think it should be a gcp_bash_operator.py deriving from Bash operator and it should be in providers/google.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] kaxil edited a comment on issue #8432: Provide GCP credentials in Bash/Python operators

Posted by GitBox <gi...@apache.org>.
kaxil edited a comment on issue #8432: Provide GCP credentials in Bash/Python operators
URL: https://github.com/apache/airflow/pull/8432#issuecomment-615865246
 
 
   I agree with Jarek here, any cloud specific code in Core Operators would hurt my eyes.
   
   Each cloud provider also provides a way to authenticate using an Environment Variable too. In case of GCP it is GOOGLE_APPLICATION_CREDENTIALS which can be used for Bash and Python Operators.
   
   And if the Virtual Machine is on that Cloud Provider itself you don't need to authenticate as you can VM's credentials too.
   
   We already allow injecting env vars to BashOperator.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] mik-laj commented on issue #8432: Provide GCP credentials in Bash/Python operators

Posted by GitBox <gi...@apache.org>.
mik-laj commented on issue #8432: Provide GCP credentials in Bash/Python operators
URL: https://github.com/apache/airflow/pull/8432#issuecomment-615560449
 
 
   @potiuk This contradicts the whole idea and the need for this operator. BashOperator and PythonOperaator are very useful because it is universal. Bash and Python are also built by compositions. New applications are installed on the system and can be used by any tool. If we inherit and make customization GCP-specific, we will limit its functionality. It will no longer be a universal operator. You will only be able to use it with one provider.  I think this is a similar problem to ``Connection.get_hook``.
   https://github.com/apache/airflow/blob/master/airflow/models/connection.py#L301
   This method is useful because it is universal and can be used regardless of the provider.
   In the future, if we need to separate the core and providers, we can extend this class with a plugin. A plugin that will add new parameters to the class. Like the get_hook method, it should use the plugin mechanism.
   
   From an architectural point of view. Here the use of inheritance will be bad, but we should composition. Inheritance will limit these operators too much.  More information: https://en.wikipedia.org/wiki/Composition_over_inheritance
   
   I hope that in the future new parameters will be added for other cloud providers, e.g. AWS.
   ```
       cross_platform_task = BashOperator(
           task_id='gcloud',
           bash_command=(
               'gsutil cp gs//bucket/a.txt a.txt && aws s3 cp test.txt s3://mybucket/test2.txt'
           ),
           gcp_conn_id=GCP_PROJECT_ID,
           aws_conn_id=AWS_PROJECT_ID,
       )
   ```
   Then it will still be a universal operator and we will not build a vendor-lock for one providers.
   
   However, this is one of the parts that should be resolved by [AIP-8](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=100827303).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] ashb commented on issue #8432: Provide GCP credentials in Bash/Python operators

Posted by GitBox <gi...@apache.org>.
ashb commented on issue #8432: Provide GCP credentials in Bash/Python operators
URL: https://github.com/apache/airflow/pull/8432#issuecomment-615735060
 
 
   If we did this, we'd need an  AWS one, plus one for most databases, then a Spark+all of them etc.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] kaxil edited a comment on issue #8432: Provide GCP credentials in Bash/Python operators

Posted by GitBox <gi...@apache.org>.
kaxil edited a comment on issue #8432: Provide GCP credentials in Bash/Python operators
URL: https://github.com/apache/airflow/pull/8432#issuecomment-615865246
 
 
   I agree with Jarek here, any cloud specific code in Core Operators would hurt my eyes.
   
   Each cloud provider also provides a way to authenticate using an Environment Variable too. In case of GCP it is GOOGLE_APPLICATION_CREDENTIALS which can be used for Bash and Python Operators.
   
   And if the Virtual Machine is on that Cloud Provider itself you don't need to authenticate as you can VM's credentials too.
   
   We already allow injecting env vars to BashOperator.
   
   Users have enough flexibility in Bash and Python, is my main point. I would let them take care of this, rather than having us to maintain this code.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services