Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/03/04 06:51:06 UTC

[GitHub] [airflow] baolsen opened a new pull request #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML

baolsen opened a new pull request #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML
URL: https://github.com/apache/airflow/pull/7619
 
 
   ---
   Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg)
   
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Commit message/PR title starts with `[AIRFLOW-NNNN]`. AIRFLOW-NNNN = JIRA ID<sup>*</sup>
   - [ ] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   <sup>*</sup> For document-only changes commit message can start with `[AIRFLOW-XXXX]`.
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] baolsen commented on a change in pull request #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML

Posted by GitBox <gi...@apache.org>.
baolsen commented on a change in pull request #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML
URL: https://github.com/apache/airflow/pull/7619#discussion_r391410752
 
 

 ##########
 File path: airflow/providers/amazon/aws/hooks/base_aws.py
 ##########
 @@ -206,6 +214,106 @@ def _get_credentials(self, region_name):
             endpoint_url,
         )
 
+    def _assume_role(
+            self,
+            sts_client: boto3.client,
+            extra_config: dict,
+            role_arn: str,
+            assume_role_kwargs: dict):
+        if "external_id" in extra_config:  # Backwards compatibility
+            assume_role_kwargs["ExternalId"] = extra_config.get(
+                "external_id"
+            )
+        role_session_name = "Airflow_" + self.aws_conn_id
+        self.log.info(
+            "Doing sts_client.assume_role to role_arn=%s (role_session_name=%s)",
+            role_arn,
+            role_session_name,
+        )
+        return sts_client.assume_role(
+            RoleArn=role_arn,
+            RoleSessionName=role_session_name,
+            **assume_role_kwargs
+        )
+
+    def _assume_role_with_saml(
+            self,
+            sts_client: boto3.client,
+            extra_config: dict,
+            role_arn: str,
+            assume_role_kwargs: dict):
+
+        saml_config = extra_config['assume_role_with_saml']
+        principal_arn = saml_config['principal_arn']
+
+        idp_url = saml_config["idp_url"]
+        self.log.info("idp_url= %s", idp_url)
+
+        idp_request_kwargs = saml_config["idp_request_kwargs"]
+
+        idp_auth_method = saml_config['idp_auth_method']
+        if idp_auth_method == 'http_spegno_auth':
+            # requests_gssapi will need paramiko > 2.6 since you'll need
+            # 'gssapi' not 'python-gssapi' from PyPi.
+            # https://github.com/paramiko/paramiko/pull/1311
 
 Review comment:
   Added to the PR description; will quickly add it to the how-to documentation as well :)


[GitHub] [airflow] nuclearpinguin commented on a change in pull request #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML

Posted by GitBox <gi...@apache.org>.
nuclearpinguin commented on a change in pull request #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML
URL: https://github.com/apache/airflow/pull/7619#discussion_r387531756
 
 

 ##########
 File path: airflow/providers/amazon/aws/hooks/base_aws.py
 ##########
 @@ -156,26 +157,31 @@ def _get_credentials(self, region_name):
                         **session_kwargs
                     )
                     sts_client = sts_session.client("sts", config=self.config)
-                    # Assume role
+
                     assume_role_kwargs = dict()
                     if "assume_role_kwargs" in extra_config:
                         assume_role_kwargs = extra_config["assume_role_kwargs"]
-                    if "external_id" in extra_config:  # Backwards compatibility
-                        assume_role_kwargs["ExternalId"] = extra_config.get(
-                            "external_id"
-                        )
 
-                    role_session_name = "Airflow_" + self.aws_conn_id
-                    self.log.info(
-                        "Doing assume_role to role_arn=%s role_session_name=%s",
+                    assume_role_method = None
+                    if "assume_role_method" in extra_config:
+                        assume_role_method = extra_config['assume_role_method']
+                    self.log.info("assume_role_method=%s", assume_role_method)
+                    method = None
+                    if not assume_role_method:
+                        method = self._assume_role
+                    elif assume_role_method == 'assume_role_with_saml':
+                        method = self._assume_role_with_saml
+                    else:
+                        raise NotImplementedError(
+                            'assume_role_method=%s' % assume_role_method)
 
 Review comment:
   When I get this exception, I have no idea what went wrong. The message should be more meaningful, and NotImplementedError is probably not the best choice here. Also, we can use f-strings for formatting :)
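A minimal sketch of what a more descriptive failure could look like, written as a standalone function rather than as the hook's actual code. The function name `select_assume_role_method`, the `"assume_role"` default key, and the choice of `ValueError` are assumptions for illustration; the diff above only shows that a missing setting falls back to `_assume_role`.

```python
from typing import Callable, Dict, Optional


def select_assume_role_method(
    assume_role_method: Optional[str],
    methods: Dict[str, Callable],
) -> Callable:
    """Pick the callable for the configured method or fail with a clear message.

    Hypothetical helper: `methods` maps a config value to the function to call.
    """
    # A missing or empty setting falls back to the plain assume-role call,
    # mirroring the `if not assume_role_method` branch in the diff.
    name = assume_role_method or "assume_role"
    try:
        return methods[name]
    except KeyError:
        supported = ", ".join(sorted(methods))
        # ValueError is an assumption; the point is that the message names
        # the offending setting and lists what is accepted.
        raise ValueError(
            f"Unsupported assume_role_method={assume_role_method!r} "
            f"in the connection extra. Supported values: {supported}."
        ) from None
```

With a message like this, a typo in the connection extra points the user at the setting to fix instead of only echoing the value.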


[GitHub] [airflow] feluelle commented on a change in pull request #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML

Posted by GitBox <gi...@apache.org>.
feluelle commented on a change in pull request #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML
URL: https://github.com/apache/airflow/pull/7619#discussion_r390251473
 
 

 ##########
 File path: airflow/providers/amazon/aws/hooks/base_aws.py
 ##########
 @@ -206,6 +214,106 @@ def _get_credentials(self, region_name):
             endpoint_url,
         )
 
+    def _assume_role(
+            self,
+            sts_client: boto3.client,
+            extra_config: dict,
+            role_arn: str,
+            assume_role_kwargs: dict):
+        if "external_id" in extra_config:  # Backwards compatibility
+            assume_role_kwargs["ExternalId"] = extra_config.get(
+                "external_id"
+            )
+        role_session_name = "Airflow_" + self.aws_conn_id
+        self.log.info(
+            "Doing sts_client.assume_role to role_arn=%s (role_session_name=%s)",
+            role_arn,
+            role_session_name,
+        )
+        return sts_client.assume_role(
+            RoleArn=role_arn,
+            RoleSessionName=role_session_name,
+            **assume_role_kwargs
+        )
+
+    def _assume_role_with_saml(
+            self,
+            sts_client: boto3.client,
+            extra_config: dict,
+            role_arn: str,
+            assume_role_kwargs: dict):
+
+        saml_config = extra_config['assume_role_with_saml']
+        principal_arn = saml_config['principal_arn']
+
+        idp_url = saml_config["idp_url"]
+        self.log.info("idp_url= %s", idp_url)
+
+        idp_request_kwargs = saml_config["idp_request_kwargs"]
+
+        idp_auth_method = saml_config['idp_auth_method']
+        if idp_auth_method == 'http_spegno_auth':
+            # requests_gssapi will need paramiko > 2.6 since you'll need
+            # 'gssapi' not 'python-gssapi' from PyPi.
+            # https://github.com/paramiko/paramiko/pull/1311
 
 Review comment:
   In my opinion this would be better to have in the commit message and/or the PR description.
   
   But it's fine I guess - not that important to me.


[GitHub] [airflow] codecov-io edited a comment on issue #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on issue #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML
URL: https://github.com/apache/airflow/pull/7619#issuecomment-598039648
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/7619?src=pr&el=h1) Report
   > Merging [#7619](https://codecov.io/gh/apache/airflow/pull/7619?src=pr&el=desc) into [master](https://codecov.io/gh/apache/airflow/commit/b39468d2878554ba60863656364b4a95eda30685&el=desc) will **decrease** coverage by `0.61%`.
   > The diff coverage is `23.21%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/airflow/pull/7619/graphs/tree.svg?width=650&height=150&src=pr&token=WdLKlKHOAU)](https://codecov.io/gh/apache/airflow/pull/7619?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #7619      +/-   ##
   ==========================================
   - Coverage   86.76%   86.15%   -0.62%     
   ==========================================
     Files         897      904       +7     
     Lines       42819    43787     +968     
   ==========================================
   + Hits        37153    37725     +572     
   - Misses       5666     6062     +396     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/airflow/pull/7619?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [airflow/providers/amazon/aws/hooks/base\_aws.py](https://codecov.io/gh/apache/airflow/pull/7619/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvYW1hem9uL2F3cy9ob29rcy9iYXNlX2F3cy5weQ==) | `59.41% <23.21%> (-17.90%)` | :arrow_down: |
   | [...flow/providers/apache/cassandra/hooks/cassandra.py](https://codecov.io/gh/apache/airflow/pull/7619/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvYXBhY2hlL2Nhc3NhbmRyYS9ob29rcy9jYXNzYW5kcmEucHk=) | `21.51% <0.00%> (-72.16%)` | :arrow_down: |
   | [...w/providers/apache/hive/operators/mysql\_to\_hive.py](https://codecov.io/gh/apache/airflow/pull/7619/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvYXBhY2hlL2hpdmUvb3BlcmF0b3JzL215c3FsX3RvX2hpdmUucHk=) | `35.84% <0.00%> (-64.16%)` | :arrow_down: |
   | [airflow/kubernetes/volume\_mount.py](https://codecov.io/gh/apache/airflow/pull/7619/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZV9tb3VudC5weQ==) | `44.44% <0.00%> (-55.56%)` | :arrow_down: |
   | [airflow/providers/redis/operators/redis\_publish.py](https://codecov.io/gh/apache/airflow/pull/7619/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvcmVkaXMvb3BlcmF0b3JzL3JlZGlzX3B1Ymxpc2gucHk=) | `50.00% <0.00%> (-50.00%)` | :arrow_down: |
   | [airflow/kubernetes/volume.py](https://codecov.io/gh/apache/airflow/pull/7619/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZS5weQ==) | `52.94% <0.00%> (-47.06%)` | :arrow_down: |
   | [airflow/providers/mongo/sensors/mongo.py](https://codecov.io/gh/apache/airflow/pull/7619/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvbW9uZ28vc2Vuc29ycy9tb25nby5weQ==) | `53.33% <0.00%> (-46.67%)` | :arrow_down: |
   | [airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/7619/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==) | `47.18% <0.00%> (-45.08%)` | :arrow_down: |
   | [airflow/providers/mysql/operators/mysql.py](https://codecov.io/gh/apache/airflow/pull/7619/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvbXlzcWwvb3BlcmF0b3JzL215c3FsLnB5) | `55.00% <0.00%> (-45.00%)` | :arrow_down: |
   | [airflow/providers/redis/sensors/redis\_key.py](https://codecov.io/gh/apache/airflow/pull/7619/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvcmVkaXMvc2Vuc29ycy9yZWRpc19rZXkucHk=) | `61.53% <0.00%> (-38.47%)` | :arrow_down: |
   | ... and [29 more](https://codecov.io/gh/apache/airflow/pull/7619/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/airflow/pull/7619?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/airflow/pull/7619?src=pr&el=footer). Last update [b39468d...2e12bf6](https://codecov.io/gh/apache/airflow/pull/7619?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


[GitHub] [airflow] baolsen commented on issue #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML

Posted by GitBox <gi...@apache.org>.
baolsen commented on issue #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML
URL: https://github.com/apache/airflow/pull/7619#issuecomment-595234976
 
 
   Timed out again... Please re-restart :)


[GitHub] [airflow] baolsen commented on issue #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML

Posted by GitBox <gi...@apache.org>.
baolsen commented on issue #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML
URL: https://github.com/apache/airflow/pull/7619#issuecomment-594356623
 
 
   I am unsure how to unit test this - suggestions welcome :). 
   I have tested it against my AWS environment and IDP and it is working as intended.
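One possible way to unit test the STS call without touching AWS is to pass in a mocked client and assert on the arguments it receives. The sketch below is an assumption-laden illustration: `HookUnderTest` is a hypothetical stand-in that copies only the `_assume_role` logic from the diff (it is not the real hook class), and the ARN and external ID are made-up values.

```python
import logging
from unittest import mock


class HookUnderTest:
    """Hypothetical stand-in with only the `_assume_role` logic from the diff."""

    def __init__(self, aws_conn_id: str):
        self.aws_conn_id = aws_conn_id
        self.log = logging.getLogger(__name__)

    def _assume_role(self, sts_client, extra_config, role_arn, assume_role_kwargs):
        if "external_id" in extra_config:  # Backwards compatibility
            assume_role_kwargs["ExternalId"] = extra_config.get("external_id")
        role_session_name = "Airflow_" + self.aws_conn_id
        return sts_client.assume_role(
            RoleArn=role_arn,
            RoleSessionName=role_session_name,
            **assume_role_kwargs
        )


def test_assume_role_passes_external_id():
    # The mock replaces the boto3 STS client, so no network call is made.
    sts_client = mock.MagicMock()
    sts_client.assume_role.return_value = {"Credentials": {"AccessKeyId": "fake"}}
    hook = HookUnderTest(aws_conn_id="aws_default")

    result = hook._assume_role(
        sts_client=sts_client,
        extra_config={"external_id": "ext-123"},
        role_arn="arn:aws:iam::123456789012:role/example",
        assume_role_kwargs={},
    )

    # The legacy `external_id` extra should be forwarded as `ExternalId`.
    sts_client.assume_role.assert_called_once_with(
        RoleArn="arn:aws:iam::123456789012:role/example",
        RoleSessionName="Airflow_aws_default",
        ExternalId="ext-123",
    )
    assert result["Credentials"]["AccessKeyId"] == "fake"
```

The SAML path could be covered the same way by also patching the HTTP request to the IdP, although that is not shown here.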


[GitHub] [airflow] baolsen commented on issue #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML

Posted by GitBox <gi...@apache.org>.
baolsen commented on issue #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML
URL: https://github.com/apache/airflow/pull/7619#issuecomment-594509306
 
 
   Hey @potiuk . Sure, I'd be willing to give it a try :) Probably best to create a Slack channel... It is blocked at my work, but I can use it via mobile to keep in touch. 
   Should we add the changes to this JIRA? 
   Perhaps better to create another, just for the AWS system test scaffolding, and rebase this one later?



[GitHub] [airflow] nuclearpinguin commented on issue #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML

Posted by GitBox <gi...@apache.org>.
nuclearpinguin commented on issue #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML
URL: https://github.com/apache/airflow/pull/7619#issuecomment-596073262
 
 
   @baolsen can you rebase onto the latest master? Some system tests are failing (they should not be run at all).


[GitHub] [airflow] baolsen commented on issue #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML

Posted by GitBox <gi...@apache.org>.
baolsen commented on issue #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML
URL: https://github.com/apache/airflow/pull/7619#issuecomment-598012580
 
 
   > The `_assume_role_with_saml` method looks a bit _complex_ to me, because you are importing so many different libraries and you are doing a lot of semantic blocks which are showing that it could be useful to split it - So I would probably refactor it.
   > 
   > In my opinion the whole `_get_credentials` is too _complex_. Maybe it is time to add a sub-class `AwsCredentialsExtractor` or similar where we can split this whole function into smaller pieces we also fully test.
   > 
   > But in general it really looks good to me. 👍 You added the documentation to the how-to which is great :)
   
   It is getting quite complex. I think I'd rather leave it for now, until the AWS system tests are in place; after that, refactoring and unit testing will be possible :)
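As a rough illustration of the kind of split discussed above, the "extract SAML assertion" step could live in a small function that takes the IdP response body and returns the assertion, which makes it testable without a network or AWS. This is only a sketch under stated assumptions: `extract_saml_assertion`, `element_path`, and `attribute` are hypothetical names, and it swaps in the standard library's `xml.etree.ElementTree` (which supports only a limited XPath subset) in place of the `lxml` XPath call used in the diff.

```python
import xml.etree.ElementTree as ET


def extract_saml_assertion(
    content: bytes,
    element_path: str,
    attribute: str = "value",
) -> str:
    """Return the SAML assertion stored in one attribute of the IdP response.

    Hypothetical helper: `element_path` is an ElementTree path expression,
    not a full XPath as accepted by lxml.
    """
    root = ET.fromstring(content)
    matches = root.findall(element_path)
    if len(matches) != 1:
        raise ValueError(
            f"Expected exactly one element matching {element_path!r} "
            f"in the IdP response, found {len(matches)}."
        )
    assertion = matches[0].get(attribute)
    if not assertion:
        raise ValueError(
            f"Element matching {element_path!r} has no non-empty "
            f"{attribute!r} attribute; cannot read the SAML assertion."
        )
    return assertion
```

A call such as `extract_saml_assertion(body, ".//input[@name='SAMLResponse']")` would then be the only place that knows about the response layout, and the remaining steps could be split out in the same fashion.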


[GitHub] [airflow] nuclearpinguin commented on a change in pull request #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML

Posted by GitBox <gi...@apache.org>.
nuclearpinguin commented on a change in pull request #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML
URL: https://github.com/apache/airflow/pull/7619#discussion_r387532026
 
 

 ##########
 File path: airflow/providers/amazon/aws/hooks/base_aws.py
 ##########
 @@ -206,6 +212,104 @@ def _get_credentials(self, region_name):
             endpoint_url,
         )
 
+    def _assume_role(
+            self,
+            sts_client,
+            extra_config,
+            role_arn,
+            assume_role_kwargs):
 
 Review comment:
   Can you please add type annotation? Those really help :)


[GitHub] [airflow] nuclearpinguin commented on a change in pull request #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML

Posted by GitBox <gi...@apache.org>.
nuclearpinguin commented on a change in pull request #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML
URL: https://github.com/apache/airflow/pull/7619#discussion_r387532784
 
 

 ##########
 File path: airflow/providers/amazon/aws/hooks/base_aws.py
 ##########
 @@ -206,6 +212,104 @@ def _get_credentials(self, region_name):
             endpoint_url,
         )
 
+    def _assume_role(
+            self,
+            sts_client,
+            extra_config,
+            role_arn,
+            assume_role_kwargs):
+        if "external_id" in extra_config:  # Backwards compatibility
+            assume_role_kwargs["ExternalId"] = extra_config.get(
+                "external_id"
+            )
+        role_session_name = "Airflow_" + self.aws_conn_id
+        self.log.info(
+            "Doing sts_client.assume_role to role_arn=%s (role_session_name=%s)",
+            role_arn,
+            role_session_name,
+        )
+        return sts_client.assume_role(
+            RoleArn=role_arn,
+            RoleSessionName=role_session_name,
+            **assume_role_kwargs
+        )
+
+    def _assume_role_with_saml(
+            self,
+            sts_client,
+            extra_config,
+            role_arn,
+            assume_role_kwargs):
+
+        saml_config = extra_config['assume_role_with_saml']
+        principal_arn = saml_config['principal_arn']
+
+        idp_url = saml_config["idp_url"]
+        self.log.info("idp_url= %s", idp_url)
+
+        idp_request_kwargs = saml_config["idp_request_kwargs"]
+
+        idp_auth_method = saml_config['idp_auth_method']
+        if idp_auth_method == 'http_spegno_auth':
+            # requests_gssapi will need paramiko > 2.6 since you'll need
+            # 'gssapi' not 'python-gssapi' from PyPi.
+            # https://github.com/paramiko/paramiko/pull/1311
+            import requests_gssapi
+            auth = requests_gssapi.HTTPSPNEGOAuth()
+            if 'mutual_authentication' in saml_config:
+                mutual_auth = saml_config['mutual_authentication']
+                if mutual_auth == 'REQUIRED':
+                    auth = requests_gssapi.HTTPSPNEGOAuth(
+                        requests_gssapi.REQUIRED)
+                elif mutual_auth == 'OPTIONAL':
+                    auth = requests_gssapi.HTTPSPNEGOAuth(
+                        requests_gssapi.OPTIONAL)
+                elif mutual_auth == 'DISABLED':
+                    auth = requests_gssapi.HTTPSPNEGOAuth(
+                        requests_gssapi.DISABLED)
+                else:
+                    raise NotImplementedError(
+                        'mutual_authentication=%s' % mutual_auth)
 
 Review comment:
   Same here, the exception does not help the user.


[GitHub] [airflow] baolsen commented on a change in pull request #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML

Posted by GitBox <gi...@apache.org>.
baolsen commented on a change in pull request #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML
URL: https://github.com/apache/airflow/pull/7619#discussion_r388081436
 
 

 ##########
 File path: airflow/providers/amazon/aws/hooks/base_aws.py
 ##########
 @@ -206,6 +212,104 @@ def _get_credentials(self, region_name):
             endpoint_url,
         )
 
+    def _assume_role(
+            self,
+            sts_client,
+            extra_config,
+            role_arn,
+            assume_role_kwargs):
+        if "external_id" in extra_config:  # Backwards compatibility
+            assume_role_kwargs["ExternalId"] = extra_config.get(
+                "external_id"
+            )
+        role_session_name = "Airflow_" + self.aws_conn_id
+        self.log.info(
+            "Doing sts_client.assume_role to role_arn=%s (role_session_name=%s)",
+            role_arn,
+            role_session_name,
+        )
+        return sts_client.assume_role(
+            RoleArn=role_arn,
+            RoleSessionName=role_session_name,
+            **assume_role_kwargs
+        )
+
+    def _assume_role_with_saml(
+            self,
+            sts_client,
+            extra_config,
+            role_arn,
+            assume_role_kwargs):
+
+        saml_config = extra_config['assume_role_with_saml']
+        principal_arn = saml_config['principal_arn']
+
+        idp_url = saml_config["idp_url"]
+        self.log.info("idp_url= %s", idp_url)
+
+        idp_request_kwargs = saml_config["idp_request_kwargs"]
+
+        idp_auth_method = saml_config['idp_auth_method']
+        if idp_auth_method == 'http_spegno_auth':
+            # requests_gssapi will need paramiko > 2.6 since you'll need
+            # 'gssapi' not 'python-gssapi' from PyPi.
+            # https://github.com/paramiko/paramiko/pull/1311
+            import requests_gssapi
+            auth = requests_gssapi.HTTPSPNEGOAuth()
+            if 'mutual_authentication' in saml_config:
+                mutual_auth = saml_config['mutual_authentication']
+                if mutual_auth == 'REQUIRED':
+                    auth = requests_gssapi.HTTPSPNEGOAuth(
+                        requests_gssapi.REQUIRED)
+                elif mutual_auth == 'OPTIONAL':
+                    auth = requests_gssapi.HTTPSPNEGOAuth(
+                        requests_gssapi.OPTIONAL)
+                elif mutual_auth == 'DISABLED':
+                    auth = requests_gssapi.HTTPSPNEGOAuth(
+                        requests_gssapi.DISABLED)
+                else:
+                    raise NotImplementedError(
+                        'mutual_authentication=%s' % mutual_auth)
+            # Query the IDP
+            import requests
+            idp_reponse = requests.get(
+                idp_url, auth=auth, **idp_request_kwargs)
+            idp_reponse.raise_for_status()
+
+            # Assist with debugging. Note: contains sensitive info!
+            xpath = saml_config['saml_response_xpath']
+            log_idp_response = 'log_idp_response' in saml_config and saml_config[
+                'log_idp_response']
+            if log_idp_response:
+                self.log.warning(
+                    'The IDP response contains sensitive information,'
+                    ' but log_idp_response is ON (%s).', log_idp_response)
+                self.log.info('idp_reponse.content= %s', idp_reponse.content)
+                self.log.info('xpath= %s', xpath)
+
+            # Extract SAML Assertion from the returned HTML / XML
+            from lxml import etree
+            xml = etree.fromstring(idp_reponse.content)
+            saml_assertion = xml.xpath(xpath)
+            if isinstance(saml_assertion, list):
+                if len(saml_assertion) == 1:
+                    saml_assertion = saml_assertion[0]
+            if not saml_assertion:
+                raise ValueError('Invalid SAML Assertion')
+        else:
+            raise NotImplementedError('idp_auth_method=%s' % idp_auth_method)
 
 Review comment:
   Also fixed


[GitHub] [airflow] potiuk commented on issue #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML
URL: https://github.com/apache/airflow/pull/7619#issuecomment-594514397
 
 
   Yeah - cool. That would be great. I will create a channel for that. I think it should be OK to proceed with that one without system tests and merge it, then add the system tests as a separate PR, since some refactoring might be needed.


[GitHub] [airflow] nuclearpinguin commented on issue #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML

Posted by GitBox <gi...@apache.org>.
nuclearpinguin commented on issue #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML
URL: https://github.com/apache/airflow/pull/7619#issuecomment-595204227
 
 
   > Build can be restarted - test timed out :)
   
   Restarted :)


   | [airflow/providers/mysql/operators/mysql.py](https://codecov.io/gh/apache/airflow/pull/7619/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvbXlzcWwvb3BlcmF0b3JzL215c3FsLnB5) | `55.00% <0.00%> (-45.00%)` | :arrow_down: |
   | [airflow/providers/redis/sensors/redis\_key.py](https://codecov.io/gh/apache/airflow/pull/7619/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvcmVkaXMvc2Vuc29ycy9yZWRpc19rZXkucHk=) | `61.53% <0.00%> (-38.47%)` | :arrow_down: |
   | ... and [29 more](https://codecov.io/gh/apache/airflow/pull/7619/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/airflow/pull/7619?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/airflow/pull/7619?src=pr&el=footer). Last update [b39468d...2e12bf6](https://codecov.io/gh/apache/airflow/pull/7619?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


[GitHub] [airflow] baolsen commented on a change in pull request #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML

Posted by GitBox <gi...@apache.org>.
baolsen commented on a change in pull request #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML
URL: https://github.com/apache/airflow/pull/7619#discussion_r388081255
 
 

 ##########
 File path: airflow/providers/amazon/aws/hooks/base_aws.py
 ##########
 @@ -206,6 +212,104 @@ def _get_credentials(self, region_name):
             endpoint_url,
         )
 
+    def _assume_role(
+            self,
+            sts_client,
+            extra_config,
+            role_arn,
+            assume_role_kwargs):
 
 Review comment:
   Thanks, done


[GitHub] [airflow] baolsen commented on a change in pull request #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML

Posted by GitBox <gi...@apache.org>.
baolsen commented on a change in pull request #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML
URL: https://github.com/apache/airflow/pull/7619#discussion_r388081420
 
 

 ##########
 File path: airflow/providers/amazon/aws/hooks/base_aws.py
 ##########
 @@ -206,6 +212,104 @@ def _get_credentials(self, region_name):
             endpoint_url,
         )
 
+    def _assume_role(
+            self,
+            sts_client,
+            extra_config,
+            role_arn,
+            assume_role_kwargs):
+        if "external_id" in extra_config:  # Backwards compatibility
+            assume_role_kwargs["ExternalId"] = extra_config.get(
+                "external_id"
+            )
+        role_session_name = "Airflow_" + self.aws_conn_id
+        self.log.info(
+            "Doing sts_client.assume_role to role_arn=%s (role_session_name=%s)",
+            role_arn,
+            role_session_name,
+        )
+        return sts_client.assume_role(
+            RoleArn=role_arn,
+            RoleSessionName=role_session_name,
+            **assume_role_kwargs
+        )
+
+    def _assume_role_with_saml(
+            self,
+            sts_client,
+            extra_config,
+            role_arn,
+            assume_role_kwargs):
+
+        saml_config = extra_config['assume_role_with_saml']
+        principal_arn = saml_config['principal_arn']
+
+        idp_url = saml_config["idp_url"]
+        self.log.info("idp_url= %s", idp_url)
+
+        idp_request_kwargs = saml_config["idp_request_kwargs"]
+
+        idp_auth_method = saml_config['idp_auth_method']
+        if idp_auth_method == 'http_spegno_auth':
+            # requests_gssapi will need paramiko > 2.6 since you'll need
+            # 'gssapi' not 'python-gssapi' from PyPi.
+            # https://github.com/paramiko/paramiko/pull/1311
+            import requests_gssapi
+            auth = requests_gssapi.HTTPSPNEGOAuth()
+            if 'mutual_authentication' in saml_config:
+                mutual_auth = saml_config['mutual_authentication']
+                if mutual_auth == 'REQUIRED':
+                    auth = requests_gssapi.HTTPSPNEGOAuth(
+                        requests_gssapi.REQUIRED)
+                elif mutual_auth == 'OPTIONAL':
+                    auth = requests_gssapi.HTTPSPNEGOAuth(
+                        requests_gssapi.OPTIONAL)
+                elif mutual_auth == 'DISABLED':
+                    auth = requests_gssapi.HTTPSPNEGOAuth(
+                        requests_gssapi.DISABLED)
+                else:
+                    raise NotImplementedError(
+                        'mutual_authentication=%s' % mutual_auth)
 
 Review comment:
   Fixed, thanks.


[GitHub] [airflow] nuclearpinguin merged pull request #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML

Posted by GitBox <gi...@apache.org>.
nuclearpinguin merged pull request #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML
URL: https://github.com/apache/airflow/pull/7619
 
 
   


[GitHub] [airflow] potiuk commented on issue #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML
URL: https://github.com/apache/airflow/pull/7619#issuecomment-594470446
 
 
   > I am unsure how to unit test this - suggestions welcome :).
   > I have tested it against my AWS environment and IDP and it is working as intended.
   
   Hello @baolsen, I have a proposal: we have implemented a lot of "system tests" for GCP, and maybe this is the right time to try the same approach for AWS?
   
   @nuclearpinguin is currently implementing the latest fixes to our "reusable" system test class, and it is already described fairly well in the documentation: https://github.com/apache/airflow/blob/master/TESTING.rst#airflow-system-tests 
   
   Breeze is now very well prepared to run such system tests automatically. I also have an open discussion about backporting the "providers" packages to 1.10 (part of AIP-21), and I want to propose that one of the conditions there is to have system tests, run semi-automatically, that cover the provider package's functionality. We want to fully automate this with AIP-4.
   
   Maybe we can work together (me, @nuclearpinguin, @mik-laj and you) to add the first AWS system tests and run them automatically? Maybe we can also, with your involvement, improve it all and make it easily usable by others?
   
   It's rather straightforward; please take a look at the docs. It boils down to having an "example_dag" that is actually runnable (given authorisation/configuration information). The DAG should perform setup, some operations, and teardown on a real external system (GCP or AWS in this case). We have some useful helpers that make it easy to encapsulate this in a Pytest test, marked with @pytest.mark.system("amazon") in your case. Then it can easily be run automatically (we will just have to provide the authorisation variables in a files/airflow-breeze-config/variables.env file).
   
   It's a bit involved, as those tests usually run for a long time, but at the same time you get an environment to test your changes on your own against a real system (something that you do manually now and that can be automated for any future changes).
   
   What do you think? We would love to help with that! 
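
As a possible complement to the system tests proposed above, the plain role-assumption path could probably also be checked offline. The following is only a minimal sketch: `assume_role` is a stand-alone copy of the `_assume_role` logic quoted in this thread (the function name, its `aws_conn_id` parameter, and the ARN are illustrative assumptions, not the PR's code), and `unittest.mock.MagicMock` stands in for the boto3 STS client so that no AWS call is made.

```python
from unittest import mock


def assume_role(sts_client, aws_conn_id, extra_config, role_arn, assume_role_kwargs):
    """Stand-alone copy of the `_assume_role` logic quoted in this thread."""
    if "external_id" in extra_config:  # Backwards compatibility
        assume_role_kwargs["ExternalId"] = extra_config.get("external_id")
    role_session_name = "Airflow_" + aws_conn_id
    return sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName=role_session_name,
        **assume_role_kwargs
    )


def test_assume_role_passes_external_id():
    # MagicMock replaces the STS client; the return value is a fake response.
    sts_client = mock.MagicMock()
    sts_client.assume_role.return_value = {"Credentials": {"AccessKeyId": "fake"}}

    result = assume_role(
        sts_client,
        aws_conn_id="aws_default",
        extra_config={"external_id": "abc123"},
        role_arn="arn:aws:iam::123456789012:role/example",  # placeholder ARN
        assume_role_kwargs={},
    )

    # The legacy "external_id" key should be forwarded as ExternalId.
    sts_client.assume_role.assert_called_once_with(
        RoleArn="arn:aws:iam::123456789012:role/example",
        RoleSessionName="Airflow_aws_default",
        ExternalId="abc123",
    )
    assert result["Credentials"]["AccessKeyId"] == "fake"
```

A test like this only verifies how arguments are forwarded; it says nothing about whether a real IdP or AWS account accepts them, which is what the system tests would cover.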


[GitHub] [airflow] nuclearpinguin commented on a change in pull request #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML

Posted by GitBox <gi...@apache.org>.
nuclearpinguin commented on a change in pull request #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML
URL: https://github.com/apache/airflow/pull/7619#discussion_r387532488
 
 

 ##########
 File path: airflow/providers/amazon/aws/hooks/base_aws.py
 ##########
 @@ -206,6 +212,104 @@ def _get_credentials(self, region_name):
             endpoint_url,
         )
 
+    def _assume_role(
+            self,
+            sts_client,
+            extra_config,
+            role_arn,
+            assume_role_kwargs):
+        if "external_id" in extra_config:  # Backwards compatibility
+            assume_role_kwargs["ExternalId"] = extra_config.get(
+                "external_id"
+            )
+        role_session_name = "Airflow_" + self.aws_conn_id
+        self.log.info(
+            "Doing sts_client.assume_role to role_arn=%s (role_session_name=%s)",
+            role_arn,
+            role_session_name,
+        )
+        return sts_client.assume_role(
+            RoleArn=role_arn,
+            RoleSessionName=role_session_name,
+            **assume_role_kwargs
+        )
+
+    def _assume_role_with_saml(
+            self,
+            sts_client,
+            extra_config,
+            role_arn,
+            assume_role_kwargs):
+
+        saml_config = extra_config['assume_role_with_saml']
+        principal_arn = saml_config['principal_arn']
+
+        idp_url = saml_config["idp_url"]
+        self.log.info("idp_url= %s", idp_url)
+
+        idp_request_kwargs = saml_config["idp_request_kwargs"]
+
+        idp_auth_method = saml_config['idp_auth_method']
+        if idp_auth_method == 'http_spegno_auth':
+            # requests_gssapi will need paramiko > 2.6 since you'll need
+            # 'gssapi' not 'python-gssapi' from PyPi.
+            # https://github.com/paramiko/paramiko/pull/1311
+            import requests_gssapi
+            auth = requests_gssapi.HTTPSPNEGOAuth()
+            if 'mutual_authentication' in saml_config:
+                mutual_auth = saml_config['mutual_authentication']
+                if mutual_auth == 'REQUIRED':
+                    auth = requests_gssapi.HTTPSPNEGOAuth(
+                        requests_gssapi.REQUIRED)
 
 Review comment:
   Is there any reason why we cannot keep this in one line? 🤔
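
One way the repeated `if`/`elif` branches in the hunk above might be shortened is a lookup table keyed by the configured string. This is a sketch of a swapped-in technique, not what the PR does: the three constants below are stand-ins for `requests_gssapi.REQUIRED`, `requests_gssapi.OPTIONAL` and `requests_gssapi.DISABLED`, and `resolve_mutual_authentication` is a hypothetical helper name.

```python
# Stand-in values so the sketch runs without requests_gssapi installed.
# In real code these would be requests_gssapi.REQUIRED / OPTIONAL / DISABLED.
REQUIRED, OPTIONAL, DISABLED = "required-const", "optional-const", "disabled-const"

MUTUAL_AUTH_OPTIONS = {
    "REQUIRED": REQUIRED,
    "OPTIONAL": OPTIONAL,
    "DISABLED": DISABLED,
}


def resolve_mutual_authentication(saml_config):
    """Map the configured string to a constant, or None if the key is absent."""
    if "mutual_authentication" not in saml_config:
        return None  # caller keeps the library default
    mutual_auth = saml_config["mutual_authentication"]
    if mutual_auth not in MUTUAL_AUTH_OPTIONS:
        raise NotImplementedError(
            "mutual_authentication=%s is not supported. Supported values: %s"
            % (mutual_auth, ", ".join(sorted(MUTUAL_AUTH_OPTIONS)))
        )
    return MUTUAL_AUTH_OPTIONS[mutual_auth]
```

With such a helper, the calling code would need only one `HTTPSPNEGOAuth(...)` call for the configured case and one for the default.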


[GitHub] [airflow] baolsen commented on a change in pull request #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML

Posted by GitBox <gi...@apache.org>.
baolsen commented on a change in pull request #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML
URL: https://github.com/apache/airflow/pull/7619#discussion_r387577246
 
 

 ##########
 File path: airflow/providers/amazon/aws/hooks/base_aws.py
 ##########
 @@ -156,26 +157,31 @@ def _get_credentials(self, region_name):
                         **session_kwargs
                     )
                     sts_client = sts_session.client("sts", config=self.config)
-                    # Assume role
+
                     assume_role_kwargs = dict()
                     if "assume_role_kwargs" in extra_config:
                         assume_role_kwargs = extra_config["assume_role_kwargs"]
-                    if "external_id" in extra_config:  # Backwards compatibility
-                        assume_role_kwargs["ExternalId"] = extra_config.get(
-                            "external_id"
-                        )
 
-                    role_session_name = "Airflow_" + self.aws_conn_id
-                    self.log.info(
-                        "Doing assume_role to role_arn=%s role_session_name=%s",
+                    assume_role_method = None
+                    if "assume_role_method" in extra_config:
+                        assume_role_method = extra_config['assume_role_method']
+                    self.log.info("assume_role_method=%s", assume_role_method)
+                    method = None
+                    if not assume_role_method:
+                        method = self._assume_role
+                    elif assume_role_method == 'assume_role_with_saml':
+                        method = self._assume_role_with_saml
+                    else:
+                        raise NotImplementedError(
+                            'assume_role_method=%s' % assume_role_method)
 
 Review comment:
   Great point, thanks. What makes sense to the developer might make no sense to the user ;) I will update these cases accordingly.
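
To illustrate the kind of user-facing message discussed here, the dispatch in the hunk above could be sketched as below. This is an assumption-laden sketch rather than the code that was merged: `resolve_assume_role_method` and the `methods` mapping are hypothetical names, and, unlike the quoted hunk, this version also accepts an explicit `"assume_role"` value.

```python
def resolve_assume_role_method(extra_config, methods):
    """Pick the handler for the configured assume_role_method.

    `methods` maps supported method names to callables (hypothetical shape).
    """
    assume_role_method = extra_config.get("assume_role_method")
    if not assume_role_method:
        # Omitting the setting falls back to the plain AssumeRole call.
        return methods["assume_role"]
    if assume_role_method not in methods:
        # Name the offending setting and list the supported values, so the
        # message makes sense to a user editing the connection extra.
        raise NotImplementedError(
            "assume_role_method=%s in the connection extra is not supported. "
            "Supported values: %s"
            % (assume_role_method, ", ".join(sorted(methods)))
        )
    return methods[assume_role_method]
```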



[GitHub] [airflow] nuclearpinguin commented on a change in pull request #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML

Posted by GitBox <gi...@apache.org>.
nuclearpinguin commented on a change in pull request #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML
URL: https://github.com/apache/airflow/pull/7619#discussion_r387532891
 
 

 ##########
 File path: airflow/providers/amazon/aws/hooks/base_aws.py
 ##########
 @@ -206,6 +212,104 @@ def _get_credentials(self, region_name):
             endpoint_url,
         )
 
+    def _assume_role(
+            self,
+            sts_client,
+            extra_config,
+            role_arn,
+            assume_role_kwargs):
+        if "external_id" in extra_config:  # Backwards compatibility
+            assume_role_kwargs["ExternalId"] = extra_config.get(
+                "external_id"
+            )
+        role_session_name = "Airflow_" + self.aws_conn_id
+        self.log.info(
+            "Doing sts_client.assume_role to role_arn=%s (role_session_name=%s)",
+            role_arn,
+            role_session_name,
+        )
+        return sts_client.assume_role(
+            RoleArn=role_arn,
+            RoleSessionName=role_session_name,
+            **assume_role_kwargs
+        )
+
+    def _assume_role_with_saml(
+            self,
+            sts_client,
+            extra_config,
+            role_arn,
+            assume_role_kwargs):
+
+        saml_config = extra_config['assume_role_with_saml']
+        principal_arn = saml_config['principal_arn']
+
+        idp_url = saml_config["idp_url"]
+        self.log.info("idp_url= %s", idp_url)
+
+        idp_request_kwargs = saml_config["idp_request_kwargs"]
+
+        idp_auth_method = saml_config['idp_auth_method']
+        if idp_auth_method == 'http_spegno_auth':
+            # requests_gssapi will need paramiko > 2.6 since you'll need
+            # 'gssapi' not 'python-gssapi' from PyPi.
+            # https://github.com/paramiko/paramiko/pull/1311
+            import requests_gssapi
+            auth = requests_gssapi.HTTPSPNEGOAuth()
+            if 'mutual_authentication' in saml_config:
+                mutual_auth = saml_config['mutual_authentication']
+                if mutual_auth == 'REQUIRED':
+                    auth = requests_gssapi.HTTPSPNEGOAuth(
+                        requests_gssapi.REQUIRED)
+                elif mutual_auth == 'OPTIONAL':
+                    auth = requests_gssapi.HTTPSPNEGOAuth(
+                        requests_gssapi.OPTIONAL)
+                elif mutual_auth == 'DISABLED':
+                    auth = requests_gssapi.HTTPSPNEGOAuth(
+                        requests_gssapi.DISABLED)
+                else:
+                    raise NotImplementedError(
+                        'mutual_authentication=%s' % mutual_auth)
+            # Query the IDP
+            import requests
+            idp_reponse = requests.get(
+                idp_url, auth=auth, **idp_request_kwargs)
+            idp_reponse.raise_for_status()
+
+            # Assist with debugging. Note: contains sensitive info!
+            xpath = saml_config['saml_response_xpath']
+            log_idp_response = 'log_idp_response' in saml_config and saml_config[
+                'log_idp_response']
+            if log_idp_response:
+                self.log.warning(
+                    'The IDP response contains sensitive information,'
+                    ' but log_idp_response is ON (%s).', log_idp_response)
+                self.log.info('idp_reponse.content= %s', idp_reponse.content)
+                self.log.info('xpath= %s', xpath)
+
+            # Extract SAML Assertion from the returned HTML / XML
+            from lxml import etree
+            xml = etree.fromstring(idp_reponse.content)
+            saml_assertion = xml.xpath(xpath)
+            if isinstance(saml_assertion, list):
+                if len(saml_assertion) == 1:
+                    saml_assertion = saml_assertion[0]
+            if not saml_assertion:
+                raise ValueError('Invalid SAML Assertion')
+        else:
+            raise NotImplementedError('idp_auth_method=%s' % idp_auth_method)
 
 Review comment:
   Same here.
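
The assertion-extraction step at the end of the hunk above can be illustrated with a small stand-alone sketch. It deliberately differs from the quoted code in two labeled ways: it uses the standard library's `xml.etree.ElementTree` instead of `lxml` (so the path selects the `<input>` element and the `value` attribute is read separately), and it rejects multiple matches instead of passing a list through. The function name, the HTML snippet, and the base64 string are illustrative assumptions.

```python
import xml.etree.ElementTree as ET


def extract_saml_assertion(idp_content, element_path):
    """Return the SAML assertion carried in a form field of the IdP response."""
    root = ET.fromstring(idp_content)
    matches = root.findall(element_path)
    if len(matches) != 1:
        raise ValueError(
            "Expected exactly one element for %r, found %d"
            % (element_path, len(matches))
        )
    assertion = matches[0].get("value")
    if not assertion:
        raise ValueError("Invalid SAML Assertion: empty or missing 'value'")
    return assertion


# Made-up, well-formed IdP response used only for illustration.
example_response = (
    "<html><body><form>"
    "<input type='hidden' name='SAMLResponse' value='PHNhbWw+ZXhhbXBsZTwvc2FtbD4='/>"
    "</form></body></html>"
)
assertion = extract_saml_assertion(
    example_response, ".//input[@name='SAMLResponse']"
)
```

Real IdP pages are often not well-formed XML, so a production version would likely need an HTML-tolerant parser; that choice is outside what this thread shows.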


[GitHub] [airflow] baolsen commented on issue #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML

Posted by GitBox <gi...@apache.org>.
baolsen commented on issue #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML
URL: https://github.com/apache/airflow/pull/7619#issuecomment-595032626
 
 
   Build can be restarted - test timed out :)


[GitHub] [airflow] baolsen commented on issue #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML

Posted by GitBox <gi...@apache.org>.
baolsen commented on issue #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML
URL: https://github.com/apache/airflow/pull/7619#issuecomment-598149120
 
 
   @nuclearpinguin Ready to merge :)


[GitHub] [airflow] baolsen commented on a change in pull request #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML

Posted by GitBox <gi...@apache.org>.
baolsen commented on a change in pull request #7619: [AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML
URL: https://github.com/apache/airflow/pull/7619#discussion_r388081377
 
 

 ##########
 File path: airflow/providers/amazon/aws/hooks/base_aws.py
 ##########
 @@ -206,6 +212,104 @@ def _get_credentials(self, region_name):
             endpoint_url,
         )
 
+    def _assume_role(
+            self,
+            sts_client,
+            extra_config,
+            role_arn,
+            assume_role_kwargs):
+        if "external_id" in extra_config:  # Backwards compatibility
+            assume_role_kwargs["ExternalId"] = extra_config.get(
+                "external_id"
+            )
+        role_session_name = "Airflow_" + self.aws_conn_id
+        self.log.info(
+            "Doing sts_client.assume_role to role_arn=%s (role_session_name=%s)",
+            role_arn,
+            role_session_name,
+        )
+        return sts_client.assume_role(
+            RoleArn=role_arn,
+            RoleSessionName=role_session_name,
+            **assume_role_kwargs
+        )
+
+    def _assume_role_with_saml(
+            self,
+            sts_client,
+            extra_config,
+            role_arn,
+            assume_role_kwargs):
+
+        saml_config = extra_config['assume_role_with_saml']
+        principal_arn = saml_config['principal_arn']
+
+        idp_url = saml_config["idp_url"]
+        self.log.info("idp_url= %s", idp_url)
+
+        idp_request_kwargs = saml_config["idp_request_kwargs"]
+
+        idp_auth_method = saml_config['idp_auth_method']
+        if idp_auth_method == 'http_spegno_auth':
+            # requests_gssapi will need paramiko > 2.6 since you'll need
+            # 'gssapi' not 'python-gssapi' from PyPi.
+            # https://github.com/paramiko/paramiko/pull/1311
+            import requests_gssapi
+            auth = requests_gssapi.HTTPSPNEGOAuth()
+            if 'mutual_authentication' in saml_config:
+                mutual_auth = saml_config['mutual_authentication']
+                if mutual_auth == 'REQUIRED':
+                    auth = requests_gssapi.HTTPSPNEGOAuth(
+                        requests_gssapi.REQUIRED)
 
 Review comment:
   I asked VS Code and it couldn't tell me why it put it on another line xD. Fixed
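
For readers following the hunk above, the configuration keys it reads could be collected into a connection extra along these lines. This is a sketch only: every value is a placeholder, the top-level `role_arn` key is an assumption (the hunk receives `role_arn` as an argument and does not show where it comes from), the XPath string is a hypothetical example, and `missing_saml_keys` is an illustrative helper rather than part of the PR.

```python
import json

# Keys taken from the quoted diff; all values are placeholders.
extra_config = {
    "role_arn": "arn:aws:iam::123456789012:role/example",  # assumed key
    "assume_role_method": "assume_role_with_saml",
    "assume_role_with_saml": {
        "principal_arn": "arn:aws:iam::123456789012:saml-provider/example",
        "idp_url": "https://idp.example.com/saml",
        "idp_auth_method": "http_spegno_auth",  # spelling as in the diff
        "mutual_authentication": "OPTIONAL",  # optional key
        "idp_request_kwargs": {"timeout": 30},  # forwarded to requests.get
        "saml_response_xpath": "//input[@name='SAMLResponse']/@value",
        "log_idp_response": False,  # optional; response is sensitive
    },
}

# Keys the hunk reads with plain indexing, so a missing one raises KeyError.
REQUIRED_SAML_KEYS = (
    "principal_arn",
    "idp_url",
    "idp_request_kwargs",
    "idp_auth_method",
    "saml_response_xpath",
)


def missing_saml_keys(config):
    """List required SAML keys absent from the connection extra."""
    saml_config = config.get("assume_role_with_saml", {})
    return [key for key in REQUIRED_SAML_KEYS if key not in saml_config]


extra_json = json.dumps(extra_config, indent=2)
```

Checking for missing keys up front would let the hook report all configuration problems at once instead of failing on the first `KeyError`.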
