You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/03/16 14:13:58 UTC

[GitHub] [airflow] mik-laj opened a new pull request #7738: [AIRFLOW-7073] GKEStartPodOperator always use user credentials

mik-laj opened a new pull request #7738: [AIRFLOW-7073] GKEStartPodOperator always use user credentials
URL: https://github.com/apache/airflow/pull/7738
 
 
   The gcloud allows you to login to GCP only - ``gcloud auth login`` and for the needs of Application Default Credentials ``gcloud auth application-default login``.
   In our case, we want all commands to use only the credentials from ADC. so we need to configure the credentials in gcloud manually.
   
   ---
   Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg)
   
   Make sure to mark the boxes below before creating PR: [x]
   
   - [X] Description above provides context of the change
   - [X] Commit message/PR title starts with `[AIRFLOW-NNNN]`. AIRFLOW-NNNN = JIRA ID<sup>*</sup>
   - [X] Unit tests coverage for changes (not needed for documentation changes)
   - [X] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
   - [X] Relevant documentation is updated including usage instructions.
   - [X] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   <sup>*</sup> For document-only changes commit message can start with `[AIRFLOW-XXXX]`.
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] mik-laj commented on a change in pull request #7738: [AIRFLOW-7073] GKEStartPodOperator always use user credentials

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #7738: [AIRFLOW-7073] GKEStartPodOperator always use user credentials
URL: https://github.com/apache/airflow/pull/7738#discussion_r393522483
 
 

 ##########
 File path: airflow/providers/google/cloud/hooks/base.py
 ##########
 @@ -387,28 +391,76 @@ def provide_gcp_credential_file_as_context(self):
         It can be used to provide credentials for external programs (e.g. gcloud) that expect authorization
         file in ``GOOGLE_APPLICATION_CREDENTIALS`` environment variable.
         """
-        with tempfile.NamedTemporaryFile(mode='w+t') as conf_file:
-            key_path = self._get_field('key_path', None)  # type: Optional[str]  # noqa: E501  #  pylint: disable=protected-access
-            keyfile_dict = self._get_field('keyfile_dict', None)  # type: Optional[Dict]  # noqa: E501  # pylint: disable=protected-access
-            current_env_state = os.environ.get(CREDENTIALS)
-            try:
-                if key_path:
-                    if key_path.endswith('.p12'):
-                        raise AirflowException(
-                            'Legacy P12 key file are not supported, use a JSON key file.'
-                        )
-                    os.environ[CREDENTIALS] = key_path
-                elif keyfile_dict:
-                    conf_file.write(keyfile_dict)
-                    conf_file.flush()
-                    os.environ[CREDENTIALS] = conf_file.name
-                else:
-                    # We will use the default service account credentials.
-                    pass
-                yield conf_file
-            finally:
-                if current_env_state is None:
-                    if CREDENTIALS in os.environ:
-                        del os.environ[CREDENTIALS]
-                else:
-                    os.environ[CREDENTIALS] = current_env_state
+        key_path = self._get_field('key_path', None)  # type: Optional[str]  # noqa: E501  #  pylint: disable=protected-access
+        keyfile_dict = self._get_field('keyfile_dict', None)  # type: Optional[Dict]  # noqa: E501  # pylint: disable=protected-access
+        if key_path and keyfile_dict:
+            raise AirflowException(
+                "The `keyfile_dict` and `key_path` fields are mutually exclusive. "
+                "Please provide only one value."
+            )
+        elif key_path:
+            if key_path.endswith('.p12'):
+                raise AirflowException(
+                    'Legacy P12 key file are not supported, use a JSON key file.'
+                )
+            with patch_environ({CREDENTIALS: key_path}):
+                yield key_path
+        elif keyfile_dict:
+            with tempfile.NamedTemporaryFile(mode='w+t') as conf_file:
+                conf_file.write(keyfile_dict)
+                conf_file.flush()
+                with patch_environ({CREDENTIALS: conf_file.name}):
+                    yield conf_file.name
+        else:
+            # We will use the default service account credentials.
+            yield None
+
+    @contextmanager
+    def provide_authorized_gcloud(self):
+        """
+        Provides a separate gcloud configuration with current credentials.
+
+        The gcloud allows you to login to GCP only - ``gcloud auth login`` and
+        for the needs of Application Default Credentials ``gcloud auth application-default login``.
+        In our case, we want all commands to use only the credentials from ADCm so
+        we need to configure the credentials in gcloud manually.
+        """
+        credentials_path = _cloud_sdk.get_application_default_credentials_path()
+        project_id = self.project_id
+
+        with self.provide_gcp_credential_file_as_context(), \
+                tempfile.TemporaryDirectory() as gcloud_config_tmp, \
+                patch_environ({'CLOUDSDK_CONFIG': gcloud_config_tmp}):
+
+            # Don't display stdout/stderr for security reason
+            check_output([
+                "gcloud", "config", "set", "core/project", project_id
+            ])
+            if CREDENTIALS in os.environ:
+                # This solves most cases when we are logged in using the service key in Airflow.
+                # Don't display stdout/stderr for security reason
+                check_output([
+                    "gcloud", "auth", "activate-service-account", f"--key-file={os.environ[CREDENTIALS]}",
+                ])
+            elif os.path.exists(credentials_path):
+                # If we are logged in by `gcloud auth application-default` then we need to log in manually.
+                # This will make the `gcloud auth application-default` and `gcloud auth` credentials equals.
+                with open(credentials_path) as creds_file:
+                    creds_content = json.loads(creds_file.read())
+                    # Don't display stdout/stderr for security reason
+                    check_output([
+                        "gcloud", "config", "set", "auth/client_id", creds_content["client_id"]
+                    ])
+                    # Don't display stdout/stderr for security reason
+                    check_output([
+                        "gcloud", "config", "set", "auth/client_secret", creds_content["client_secret"]
+                    ])
+                    # Don't display stdout/stderr for security reason
+                    check_output([
+                        "gcloud",
+                        "auth",
+                        "activate-refresh-token",
+                        creds_content["client_id"],
+                        creds_content["refresh_token"],
+                    ])
 
 Review comment:
   However, GoogleSystemTest.authentication is not perfect, because this modifies user configurations, but in this PR, I want to deal with GKEStartPodOperator. In the next step, we can improve GCP authorization in tests.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] nuclearpinguin commented on a change in pull request #7738: [AIRFLOW-7073] GKEStartPodOperator always use user credentials

Posted by GitBox <gi...@apache.org>.
nuclearpinguin commented on a change in pull request #7738: [AIRFLOW-7073] GKEStartPodOperator always use user credentials
URL: https://github.com/apache/airflow/pull/7738#discussion_r393468067
 
 

 ##########
 File path: airflow/utils/process_utils.py
 ##########
 @@ -184,3 +185,26 @@ def kill_child_processes_by_pids(pids_to_kill: List[int], timeout: int = 5) -> N
             log.info("Killing child PID: %s", child.pid)
             child.kill()
             child.wait()
+
+
+@contextmanager
+def patch_environ(new_env_variables: Dict[str, str]):
 
 Review comment:
   We already have `temporary_environment_variable` in `google.utils.credentials_provider`. Let's decide to use only one function. 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] mik-laj commented on a change in pull request #7738: [AIRFLOW-7073] GKEStartPodOperator always use user credentials

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #7738: [AIRFLOW-7073] GKEStartPodOperator always use user credentials
URL: https://github.com/apache/airflow/pull/7738#discussion_r393512171
 
 

 ##########
 File path: airflow/providers/google/cloud/hooks/base.py
 ##########
 @@ -387,28 +391,76 @@ def provide_gcp_credential_file_as_context(self):
         It can be used to provide credentials for external programs (e.g. gcloud) that expect authorization
         file in ``GOOGLE_APPLICATION_CREDENTIALS`` environment variable.
         """
-        with tempfile.NamedTemporaryFile(mode='w+t') as conf_file:
-            key_path = self._get_field('key_path', None)  # type: Optional[str]  # noqa: E501  #  pylint: disable=protected-access
-            keyfile_dict = self._get_field('keyfile_dict', None)  # type: Optional[Dict]  # noqa: E501  # pylint: disable=protected-access
-            current_env_state = os.environ.get(CREDENTIALS)
-            try:
-                if key_path:
-                    if key_path.endswith('.p12'):
-                        raise AirflowException(
-                            'Legacy P12 key file are not supported, use a JSON key file.'
-                        )
-                    os.environ[CREDENTIALS] = key_path
-                elif keyfile_dict:
-                    conf_file.write(keyfile_dict)
-                    conf_file.flush()
-                    os.environ[CREDENTIALS] = conf_file.name
-                else:
-                    # We will use the default service account credentials.
-                    pass
-                yield conf_file
-            finally:
-                if current_env_state is None:
-                    if CREDENTIALS in os.environ:
-                        del os.environ[CREDENTIALS]
-                else:
-                    os.environ[CREDENTIALS] = current_env_state
+        key_path = self._get_field('key_path', None)  # type: Optional[str]  # noqa: E501  #  pylint: disable=protected-access
+        keyfile_dict = self._get_field('keyfile_dict', None)  # type: Optional[Dict]  # noqa: E501  # pylint: disable=protected-access
+        if key_path and keyfile_dict:
+            raise AirflowException(
+                "The `keyfile_dict` and `key_path` fields are mutually exclusive. "
+                "Please provide only one value."
+            )
+        elif key_path:
+            if key_path.endswith('.p12'):
+                raise AirflowException(
+                    'Legacy P12 key file are not supported, use a JSON key file.'
+                )
+            with patch_environ({CREDENTIALS: key_path}):
+                yield key_path
+        elif keyfile_dict:
+            with tempfile.NamedTemporaryFile(mode='w+t') as conf_file:
+                conf_file.write(keyfile_dict)
+                conf_file.flush()
+                with patch_environ({CREDENTIALS: conf_file.name}):
+                    yield conf_file.name
+        else:
+            # We will use the default service account credentials.
+            yield None
+
+    @contextmanager
+    def provide_authorized_gcloud(self):
+        """
+        Provides a separate gcloud configuration with current credentials.
+
+        The gcloud allows you to login to GCP only - ``gcloud auth login`` and
+        for the needs of Application Default Credentials ``gcloud auth application-default login``.
+        In our case, we want all commands to use only the credentials from ADCm so
+        we need to configure the credentials in gcloud manually.
+        """
+        credentials_path = _cloud_sdk.get_application_default_credentials_path()
+        project_id = self.project_id
+
+        with self.provide_gcp_credential_file_as_context(), \
+                tempfile.TemporaryDirectory() as gcloud_config_tmp, \
+                patch_environ({'CLOUDSDK_CONFIG': gcloud_config_tmp}):
+
+            # Don't display stdout/stderr for security reason
+            check_output([
+                "gcloud", "config", "set", "core/project", project_id
+            ])
+            if CREDENTIALS in os.environ:
+                # This solves most cases when we are logged in using the service key in Airflow.
+                # Don't display stdout/stderr for security reason
+                check_output([
+                    "gcloud", "auth", "activate-service-account", f"--key-file={os.environ[CREDENTIALS]}",
+                ])
+            elif os.path.exists(credentials_path):
+                # If we are logged in by `gcloud auth application-default` then we need to log in manually.
+                # This will make the `gcloud auth application-default` and `gcloud auth` credentials equals.
+                with open(credentials_path) as creds_file:
+                    creds_content = json.loads(creds_file.read())
+                    # Don't display stdout/stderr for security reason
+                    check_output([
+                        "gcloud", "config", "set", "auth/client_id", creds_content["client_id"]
+                    ])
+                    # Don't display stdout/stderr for security reason
+                    check_output([
+                        "gcloud", "config", "set", "auth/client_secret", creds_content["client_secret"]
+                    ])
+                    # Don't display stdout/stderr for security reason
+                    check_output([
+                        "gcloud",
+                        "auth",
+                        "activate-refresh-token",
+                        creds_content["client_id"],
+                        creds_content["refresh_token"],
+                    ])
+            yield
 
 Review comment:
   We don't have to, because we delete the entire configuration directory.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] nuclearpinguin commented on a change in pull request #7738: [AIRFLOW-7073] GKEStartPodOperator always use user credentials

Posted by GitBox <gi...@apache.org>.
nuclearpinguin commented on a change in pull request #7738: [AIRFLOW-7073] GKEStartPodOperator always use user credentials
URL: https://github.com/apache/airflow/pull/7738#discussion_r393469171
 
 

 ##########
 File path: airflow/providers/google/cloud/hooks/base.py
 ##########
 @@ -387,28 +391,76 @@ def provide_gcp_credential_file_as_context(self):
         It can be used to provide credentials for external programs (e.g. gcloud) that expect authorization
         file in ``GOOGLE_APPLICATION_CREDENTIALS`` environment variable.
         """
-        with tempfile.NamedTemporaryFile(mode='w+t') as conf_file:
-            key_path = self._get_field('key_path', None)  # type: Optional[str]  # noqa: E501  #  pylint: disable=protected-access
-            keyfile_dict = self._get_field('keyfile_dict', None)  # type: Optional[Dict]  # noqa: E501  # pylint: disable=protected-access
-            current_env_state = os.environ.get(CREDENTIALS)
-            try:
-                if key_path:
-                    if key_path.endswith('.p12'):
-                        raise AirflowException(
-                            'Legacy P12 key file are not supported, use a JSON key file.'
-                        )
-                    os.environ[CREDENTIALS] = key_path
-                elif keyfile_dict:
-                    conf_file.write(keyfile_dict)
-                    conf_file.flush()
-                    os.environ[CREDENTIALS] = conf_file.name
-                else:
-                    # We will use the default service account credentials.
-                    pass
-                yield conf_file
-            finally:
-                if current_env_state is None:
-                    if CREDENTIALS in os.environ:
-                        del os.environ[CREDENTIALS]
-                else:
-                    os.environ[CREDENTIALS] = current_env_state
+        key_path = self._get_field('key_path', None)  # type: Optional[str]  # noqa: E501  #  pylint: disable=protected-access
+        keyfile_dict = self._get_field('keyfile_dict', None)  # type: Optional[Dict]  # noqa: E501  # pylint: disable=protected-access
+        if key_path and keyfile_dict:
+            raise AirflowException(
+                "The `keyfile_dict` and `key_path` fields are mutually exclusive. "
+                "Please provide only one value."
+            )
+        elif key_path:
+            if key_path.endswith('.p12'):
+                raise AirflowException(
+                    'Legacy P12 key file are not supported, use a JSON key file.'
+                )
+            with patch_environ({CREDENTIALS: key_path}):
+                yield key_path
+        elif keyfile_dict:
+            with tempfile.NamedTemporaryFile(mode='w+t') as conf_file:
+                conf_file.write(keyfile_dict)
+                conf_file.flush()
+                with patch_environ({CREDENTIALS: conf_file.name}):
+                    yield conf_file.name
+        else:
+            # We will use the default service account credentials.
+            yield None
+
+    @contextmanager
+    def provide_authorized_gcloud(self):
+        """
+        Provides a separate gcloud configuration with current credentials.
+
+        The gcloud allows you to login to GCP only - ``gcloud auth login`` and
+        for the needs of Application Default Credentials ``gcloud auth application-default login``.
+        In our case, we want all commands to use only the credentials from ADCm so
+        we need to configure the credentials in gcloud manually.
+        """
+        credentials_path = _cloud_sdk.get_application_default_credentials_path()
+        project_id = self.project_id
+
+        with self.provide_gcp_credential_file_as_context(), \
+                tempfile.TemporaryDirectory() as gcloud_config_tmp, \
+                patch_environ({'CLOUDSDK_CONFIG': gcloud_config_tmp}):
+
+            # Don't display stdout/stderr for security reason
+            check_output([
+                "gcloud", "config", "set", "core/project", project_id
+            ])
+            if CREDENTIALS in os.environ:
+                # This solves most cases when we are logged in using the service key in Airflow.
+                # Don't display stdout/stderr for security reason
+                check_output([
+                    "gcloud", "auth", "activate-service-account", f"--key-file={os.environ[CREDENTIALS]}",
+                ])
+            elif os.path.exists(credentials_path):
+                # If we are logged in by `gcloud auth application-default` then we need to log in manually.
+                # This will make the `gcloud auth application-default` and `gcloud auth` credentials equals.
+                with open(credentials_path) as creds_file:
+                    creds_content = json.loads(creds_file.read())
+                    # Don't display stdout/stderr for security reason
+                    check_output([
+                        "gcloud", "config", "set", "auth/client_id", creds_content["client_id"]
+                    ])
+                    # Don't display stdout/stderr for security reason
+                    check_output([
+                        "gcloud", "config", "set", "auth/client_secret", creds_content["client_secret"]
+                    ])
+                    # Don't display stdout/stderr for security reason
+                    check_output([
+                        "gcloud",
+                        "auth",
+                        "activate-refresh-token",
+                        creds_content["client_id"],
+                        creds_content["refresh_token"],
+                    ])
 
 Review comment:
   Once we have it, do we still need `GoogleSystemTest.authentication` context manager? It seems that both do a similar thing. If that's true, we should decide on a single approach to limit duplication of functionalities. 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] mik-laj merged pull request #7738: [AIRFLOW-7073] GKEStartPodOperator always use connection credentials

Posted by GitBox <gi...@apache.org>.
mik-laj merged pull request #7738: [AIRFLOW-7073] GKEStartPodOperator always use connection credentials
URL: https://github.com/apache/airflow/pull/7738
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] mik-laj commented on a change in pull request #7738: [AIRFLOW-7073] GKEStartPodOperator always use user credentials

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #7738: [AIRFLOW-7073] GKEStartPodOperator always use user credentials
URL: https://github.com/apache/airflow/pull/7738#discussion_r393522483
 
 

 ##########
 File path: airflow/providers/google/cloud/hooks/base.py
 ##########
 @@ -387,28 +391,76 @@ def provide_gcp_credential_file_as_context(self):
         It can be used to provide credentials for external programs (e.g. gcloud) that expect authorization
         file in ``GOOGLE_APPLICATION_CREDENTIALS`` environment variable.
         """
-        with tempfile.NamedTemporaryFile(mode='w+t') as conf_file:
-            key_path = self._get_field('key_path', None)  # type: Optional[str]  # noqa: E501  #  pylint: disable=protected-access
-            keyfile_dict = self._get_field('keyfile_dict', None)  # type: Optional[Dict]  # noqa: E501  # pylint: disable=protected-access
-            current_env_state = os.environ.get(CREDENTIALS)
-            try:
-                if key_path:
-                    if key_path.endswith('.p12'):
-                        raise AirflowException(
-                            'Legacy P12 key file are not supported, use a JSON key file.'
-                        )
-                    os.environ[CREDENTIALS] = key_path
-                elif keyfile_dict:
-                    conf_file.write(keyfile_dict)
-                    conf_file.flush()
-                    os.environ[CREDENTIALS] = conf_file.name
-                else:
-                    # We will use the default service account credentials.
-                    pass
-                yield conf_file
-            finally:
-                if current_env_state is None:
-                    if CREDENTIALS in os.environ:
-                        del os.environ[CREDENTIALS]
-                else:
-                    os.environ[CREDENTIALS] = current_env_state
+        key_path = self._get_field('key_path', None)  # type: Optional[str]  # noqa: E501  #  pylint: disable=protected-access
+        keyfile_dict = self._get_field('keyfile_dict', None)  # type: Optional[Dict]  # noqa: E501  # pylint: disable=protected-access
+        if key_path and keyfile_dict:
+            raise AirflowException(
+                "The `keyfile_dict` and `key_path` fields are mutually exclusive. "
+                "Please provide only one value."
+            )
+        elif key_path:
+            if key_path.endswith('.p12'):
+                raise AirflowException(
+                    'Legacy P12 key file are not supported, use a JSON key file.'
+                )
+            with patch_environ({CREDENTIALS: key_path}):
+                yield key_path
+        elif keyfile_dict:
+            with tempfile.NamedTemporaryFile(mode='w+t') as conf_file:
+                conf_file.write(keyfile_dict)
+                conf_file.flush()
+                with patch_environ({CREDENTIALS: conf_file.name}):
+                    yield conf_file.name
+        else:
+            # We will use the default service account credentials.
+            yield None
+
+    @contextmanager
+    def provide_authorized_gcloud(self):
+        """
+        Provides a separate gcloud configuration with current credentials.
+
+        The gcloud allows you to login to GCP only - ``gcloud auth login`` and
+        for the needs of Application Default Credentials ``gcloud auth application-default login``.
+        In our case, we want all commands to use only the credentials from ADCm so
+        we need to configure the credentials in gcloud manually.
+        """
+        credentials_path = _cloud_sdk.get_application_default_credentials_path()
+        project_id = self.project_id
+
+        with self.provide_gcp_credential_file_as_context(), \
+                tempfile.TemporaryDirectory() as gcloud_config_tmp, \
+                patch_environ({'CLOUDSDK_CONFIG': gcloud_config_tmp}):
+
+            # Don't display stdout/stderr for security reason
+            check_output([
+                "gcloud", "config", "set", "core/project", project_id
+            ])
+            if CREDENTIALS in os.environ:
+                # This solves most cases when we are logged in using the service key in Airflow.
+                # Don't display stdout/stderr for security reason
+                check_output([
+                    "gcloud", "auth", "activate-service-account", f"--key-file={os.environ[CREDENTIALS]}",
+                ])
+            elif os.path.exists(credentials_path):
+                # If we are logged in by `gcloud auth application-default` then we need to log in manually.
+                # This will make the `gcloud auth application-default` and `gcloud auth` credentials equals.
+                with open(credentials_path) as creds_file:
+                    creds_content = json.loads(creds_file.read())
+                    # Don't display stdout/stderr for security reason
+                    check_output([
+                        "gcloud", "config", "set", "auth/client_id", creds_content["client_id"]
+                    ])
+                    # Don't display stdout/stderr for security reason
+                    check_output([
+                        "gcloud", "config", "set", "auth/client_secret", creds_content["client_secret"]
+                    ])
+                    # Don't display stdout/stderr for security reason
+                    check_output([
+                        "gcloud",
+                        "auth",
+                        "activate-refresh-token",
+                        creds_content["client_id"],
+                        creds_content["refresh_token"],
+                    ])
 
 Review comment:
   However, GoogleSystemTest.authentication is not perfect, because this modifies user configurations.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] mik-laj commented on a change in pull request #7738: [AIRFLOW-7073] GKEStartPodOperator always use user credentials

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #7738: [AIRFLOW-7073] GKEStartPodOperator always use user credentials
URL: https://github.com/apache/airflow/pull/7738#discussion_r393513815
 
 

 ##########
 File path: airflow/providers/google/cloud/hooks/base.py
 ##########
 @@ -387,28 +391,76 @@ def provide_gcp_credential_file_as_context(self):
         It can be used to provide credentials for external programs (e.g. gcloud) that expect authorization
         file in ``GOOGLE_APPLICATION_CREDENTIALS`` environment variable.
         """
-        with tempfile.NamedTemporaryFile(mode='w+t') as conf_file:
-            key_path = self._get_field('key_path', None)  # type: Optional[str]  # noqa: E501  #  pylint: disable=protected-access
-            keyfile_dict = self._get_field('keyfile_dict', None)  # type: Optional[Dict]  # noqa: E501  # pylint: disable=protected-access
-            current_env_state = os.environ.get(CREDENTIALS)
-            try:
-                if key_path:
-                    if key_path.endswith('.p12'):
-                        raise AirflowException(
-                            'Legacy P12 key file are not supported, use a JSON key file.'
-                        )
-                    os.environ[CREDENTIALS] = key_path
-                elif keyfile_dict:
-                    conf_file.write(keyfile_dict)
-                    conf_file.flush()
-                    os.environ[CREDENTIALS] = conf_file.name
-                else:
-                    # We will use the default service account credentials.
-                    pass
-                yield conf_file
-            finally:
-                if current_env_state is None:
-                    if CREDENTIALS in os.environ:
-                        del os.environ[CREDENTIALS]
-                else:
-                    os.environ[CREDENTIALS] = current_env_state
+        key_path = self._get_field('key_path', None)  # type: Optional[str]  # noqa: E501  #  pylint: disable=protected-access
+        keyfile_dict = self._get_field('keyfile_dict', None)  # type: Optional[Dict]  # noqa: E501  # pylint: disable=protected-access
+        if key_path and keyfile_dict:
+            raise AirflowException(
+                "The `keyfile_dict` and `key_path` fields are mutually exclusive. "
+                "Please provide only one value."
+            )
+        elif key_path:
+            if key_path.endswith('.p12'):
+                raise AirflowException(
+                    'Legacy P12 key file are not supported, use a JSON key file.'
+                )
+            with patch_environ({CREDENTIALS: key_path}):
+                yield key_path
+        elif keyfile_dict:
+            with tempfile.NamedTemporaryFile(mode='w+t') as conf_file:
+                conf_file.write(keyfile_dict)
+                conf_file.flush()
+                with patch_environ({CREDENTIALS: conf_file.name}):
+                    yield conf_file.name
+        else:
+            # We will use the default service account credentials.
+            yield None
+
+    @contextmanager
+    def provide_authorized_gcloud(self):
+        """
+        Provides a separate gcloud configuration with current credentials.
+
+        The gcloud allows you to login to GCP only - ``gcloud auth login`` and
+        for the needs of Application Default Credentials ``gcloud auth application-default login``.
+        In our case, we want all commands to use only the credentials from ADCm so
+        we need to configure the credentials in gcloud manually.
+        """
+        credentials_path = _cloud_sdk.get_application_default_credentials_path()
+        project_id = self.project_id
+
+        with self.provide_gcp_credential_file_as_context(), \
+                tempfile.TemporaryDirectory() as gcloud_config_tmp, \
+                patch_environ({'CLOUDSDK_CONFIG': gcloud_config_tmp}):
+
+            # Don't display stdout/stderr for security reason
+            check_output([
+                "gcloud", "config", "set", "core/project", project_id
+            ])
+            if CREDENTIALS in os.environ:
+                # This solves most cases when we are logged in using the service key in Airflow.
+                # Don't display stdout/stderr for security reason
+                check_output([
+                    "gcloud", "auth", "activate-service-account", f"--key-file={os.environ[CREDENTIALS]}",
+                ])
+            elif os.path.exists(credentials_path):
+                # If we are logged in by `gcloud auth application-default` then we need to log in manually.
+                # This will make the `gcloud auth application-default` and `gcloud auth` credentials equals.
+                with open(credentials_path) as creds_file:
+                    creds_content = json.loads(creds_file.read())
+                    # Don't display stdout/stderr for security reason
+                    check_output([
+                        "gcloud", "config", "set", "auth/client_id", creds_content["client_id"]
+                    ])
+                    # Don't display stdout/stderr for security reason
+                    check_output([
+                        "gcloud", "config", "set", "auth/client_secret", creds_content["client_secret"]
+                    ])
+                    # Don't display stdout/stderr for security reason
+                    check_output([
+                        "gcloud",
+                        "auth",
+                        "activate-refresh-token",
+                        creds_content["client_id"],
+                        creds_content["refresh_token"],
+                    ])
 
 Review comment:
   Yes. We still have to, because this method gets data from Connection. GoogleSystemTest.authentication gets configuration from the environment variable. It will be difficult to separate it.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] nuclearpinguin commented on a change in pull request #7738: [AIRFLOW-7073] GKEStartPodOperator always use user credentials

Posted by GitBox <gi...@apache.org>.
nuclearpinguin commented on a change in pull request #7738: [AIRFLOW-7073] GKEStartPodOperator always use user credentials
URL: https://github.com/apache/airflow/pull/7738#discussion_r393557937
 
 

 ##########
 File path: airflow/providers/google/cloud/utils/credentials_provider.py
 ##########
 @@ -121,7 +96,7 @@ def provide_gcp_credentials(
             conf_file.flush()
             key_file_path = conf_file.name
         if key_file_path:
-            with temporary_environment_variable(CREDENTIALS, key_file_path):
 
 Review comment:
   CREDENTIALS is environment variable and was passed here as `variable_name`

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] nuclearpinguin commented on a change in pull request #7738: [AIRFLOW-7073] GKEStartPodOperator always use user credentials

Posted by GitBox <gi...@apache.org>.
nuclearpinguin commented on a change in pull request #7738: [AIRFLOW-7073] GKEStartPodOperator always use user credentials
URL: https://github.com/apache/airflow/pull/7738#discussion_r393469373
 
 

 ##########
 File path: airflow/providers/google/cloud/hooks/base.py
 ##########
 @@ -387,28 +391,76 @@ def provide_gcp_credential_file_as_context(self):
         It can be used to provide credentials for external programs (e.g. gcloud) that expect authorization
         file in ``GOOGLE_APPLICATION_CREDENTIALS`` environment variable.
         """
-        with tempfile.NamedTemporaryFile(mode='w+t') as conf_file:
-            key_path = self._get_field('key_path', None)  # type: Optional[str]  # noqa: E501  #  pylint: disable=protected-access
-            keyfile_dict = self._get_field('keyfile_dict', None)  # type: Optional[Dict]  # noqa: E501  # pylint: disable=protected-access
-            current_env_state = os.environ.get(CREDENTIALS)
-            try:
-                if key_path:
-                    if key_path.endswith('.p12'):
-                        raise AirflowException(
-                            'Legacy P12 key file are not supported, use a JSON key file.'
-                        )
-                    os.environ[CREDENTIALS] = key_path
-                elif keyfile_dict:
-                    conf_file.write(keyfile_dict)
-                    conf_file.flush()
-                    os.environ[CREDENTIALS] = conf_file.name
-                else:
-                    # We will use the default service account credentials.
-                    pass
-                yield conf_file
-            finally:
-                if current_env_state is None:
-                    if CREDENTIALS in os.environ:
-                        del os.environ[CREDENTIALS]
-                else:
-                    os.environ[CREDENTIALS] = current_env_state
+        key_path = self._get_field('key_path', None)  # type: Optional[str]  # noqa: E501  #  pylint: disable=protected-access
+        keyfile_dict = self._get_field('keyfile_dict', None)  # type: Optional[Dict]  # noqa: E501  # pylint: disable=protected-access
+        if key_path and keyfile_dict:
+            raise AirflowException(
+                "The `keyfile_dict` and `key_path` fields are mutually exclusive. "
+                "Please provide only one value."
+            )
+        elif key_path:
+            if key_path.endswith('.p12'):
+                raise AirflowException(
+                    'Legacy P12 key file are not supported, use a JSON key file.'
+                )
+            with patch_environ({CREDENTIALS: key_path}):
+                yield key_path
+        elif keyfile_dict:
+            with tempfile.NamedTemporaryFile(mode='w+t') as conf_file:
+                conf_file.write(keyfile_dict)
+                conf_file.flush()
+                with patch_environ({CREDENTIALS: conf_file.name}):
+                    yield conf_file.name
+        else:
+            # We will use the default service account credentials.
+            yield None
+
+    @contextmanager
+    def provide_authorized_gcloud(self):
+        """
+        Provides a separate gcloud configuration with current credentials.
+
+        The gcloud allows you to login to GCP only - ``gcloud auth login`` and
+        for the needs of Application Default Credentials ``gcloud auth application-default login``.
+        In our case, we want all commands to use only the credentials from ADCm so
+        we need to configure the credentials in gcloud manually.
+        """
+        credentials_path = _cloud_sdk.get_application_default_credentials_path()
+        project_id = self.project_id
+
+        with self.provide_gcp_credential_file_as_context(), \
+                tempfile.TemporaryDirectory() as gcloud_config_tmp, \
+                patch_environ({'CLOUDSDK_CONFIG': gcloud_config_tmp}):
+
+            # Don't display stdout/stderr for security reason
+            check_output([
+                "gcloud", "config", "set", "core/project", project_id
+            ])
+            if CREDENTIALS in os.environ:
+                # This solves most cases when we are logged in using the service key in Airflow.
+                # Don't display stdout/stderr for security reason
+                check_output([
+                    "gcloud", "auth", "activate-service-account", f"--key-file={os.environ[CREDENTIALS]}",
+                ])
+            elif os.path.exists(credentials_path):
+                # If we are logged in by `gcloud auth application-default` then we need to log in manually.
+                # This will make the `gcloud auth application-default` and `gcloud auth` credentials equals.
+                with open(credentials_path) as creds_file:
+                    creds_content = json.loads(creds_file.read())
+                    # Don't display stdout/stderr for security reason
+                    check_output([
+                        "gcloud", "config", "set", "auth/client_id", creds_content["client_id"]
+                    ])
+                    # Don't display stdout/stderr for security reason
+                    check_output([
+                        "gcloud", "config", "set", "auth/client_secret", creds_content["client_secret"]
+                    ])
+                    # Don't display stdout/stderr for security reason
+                    check_output([
+                        "gcloud",
+                        "auth",
+                        "activate-refresh-token",
+                        creds_content["client_id"],
+                        creds_content["refresh_token"],
+                    ])
+            yield
 
 Review comment:
   Shouldn't we revoke authentication in the end?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] mik-laj commented on a change in pull request #7738: [AIRFLOW-7073] GKEStartPodOperator always use user credentials

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #7738: [AIRFLOW-7073] GKEStartPodOperator always use user credentials
URL: https://github.com/apache/airflow/pull/7738#discussion_r393520342
 
 

 ##########
 File path: airflow/providers/google/cloud/utils/credentials_provider.py
 ##########
 @@ -121,7 +96,7 @@ def provide_gcp_credentials(
             conf_file.flush()
             key_file_path = conf_file.name
         if key_file_path:
-            with temporary_environment_variable(CREDENTIALS, key_file_path):
 
 Review comment:
   I don't understand why it worked before. CREDENTIALS is a variable, and here it is passed as the name of the argument.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] mik-laj commented on a change in pull request #7738: [AIRFLOW-7073] GKEStartPodOperator always use user credentials

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #7738: [AIRFLOW-7073] GKEStartPodOperator always use user credentials
URL: https://github.com/apache/airflow/pull/7738#discussion_r393519233
 
 

 ##########
 File path: airflow/utils/process_utils.py
 ##########
 @@ -184,3 +185,26 @@ def kill_child_processes_by_pids(pids_to_kill: List[int], timeout: int = 5) -> N
             log.info("Killing child PID: %s", child.pid)
             child.kill()
             child.wait()
+
+
+@contextmanager
+def patch_environ(new_env_variables: Dict[str, str]):
 
 Review comment:
   I removed  google.utils.credentials_provider.temporary_environment_variable method.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services