You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/06/16 15:59:11 UTC

[GitHub] [airflow] potiuk opened a new pull request #9333: Split-off vault hook from vault secret backend

potiuk opened a new pull request #9333:
URL: https://github.com/apache/airflow/pull/9333


   ---
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Target Github ISSUE in description if exists
   - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #9333: Add HashiCorp Vault Hook (split-out from Vault secret backend)

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441515988



##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,548 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook and the SecretBackend, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        ('approle', 'github', 'gcp', 'kubernetes', 'ldap', 'token', 'userpass')
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__()
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the KV engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_metadata(
+                path=secret_path,
+                mount_point=self.mount_point)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+        :rtype: dict
+        :return: The key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_version(
+                path=secret_path, mount_point=self.mount_point,
+                version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s and version %s",
+                           secret_path, self.mount_point, secret_version)
+            return None
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 2 and method:
+            raise VaultError("The method parameter is only valid for version 1")
+        if self.kv_engine_version == 1 and cas:
+            raise VaultError("The cas parameter is only valid for version 2")
+        if self.kv_engine_version == 1:
+            response = self.client.secrets.kv.v1.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, method=method)
+        else:
+            response = self.client.secrets.kv.v2.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, cas=cas)
+        return response
+
+
+class VaultHook(BaseHook):
+    """
+    Hook to Interact with HashiCorp Vault KeyValue Secret engine.
+
+    HashiCorp hvac documentation:
+       * https://hvac.readthedocs.io/en/stable/
+
+    You connect to the host specified as host in the connection. The login/password from the connection
+    are used as credentials usually and you can specify different authentication parameters
+    via init params or via corresponding extras in the connection.
+
+    The extras in the connection are named the same as the parameters (`mount_point`,'kv_engine_version' ...).
+
+    Login/Password are used as credentials:
+
+        * approle: password -> secret_id
+        * ldap: login -> username,   password -> password
+        * userpass: login -> username, password -> password
+
+    :param vault_conn_id: The id of the connection to use
+    :type vault_conn_id: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this ``mount_point`` is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Select the version of the engine to run (``1`` or ``2``). Defaults to
+          version defined in connection or ``2`` if not defined in connection.
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault.
+        (for ``token`` and ``github`` auth_type)
+    :type token: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type)
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type)
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``)
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type)
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type)
+    :type gcp_scopes: str
+
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        vault_conn_id: str,
+        auth_type: Optional[str] = None,
+        mount_point: Optional[str] = None,
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = None,
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+    ):
+        super().__init__()
+        self.connection = self.get_connection(vault_conn_id)
+
+        if not auth_type:
+            auth_type = self.connection.extra_dejson.get('auth_type') or "token"
+
+        if not mount_point:
+            mount_point = self.connection.extra_dejson.get('mount_point')
+        if not mount_point:
+            mount_point = 'secret'

Review comment:
       I think the only thing that goes from connection prefix to Hook is a few of the DB hooks etc, as there are already (even in 1.10) many many mapping types that are not defined  in Connection.py list.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #9333: Add HashiCorp Vault Hook (split-out from Vault secret backend)

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441491544



##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,548 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes

Review comment:
       (Or if you would like it to be: is it possible to make it so the VaultHook appears first? This is the first thing you are presented with when viewing the docs for this module.)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #9333: Add HashiCorp Vault Hook (split-out from Vault secret backend)

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441499250



##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,548 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook and the SecretBackend, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        ('approle', 'github', 'gcp', 'kubernetes', 'ldap', 'token', 'userpass')
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__()
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the KV engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_metadata(
+                path=secret_path,
+                mount_point=self.mount_point)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+        :rtype: dict
+        :return: The key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_version(
+                path=secret_path, mount_point=self.mount_point,
+                version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s and version %s",
+                           secret_path, self.mount_point, secret_version)
+            return None
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 2 and method:
+            raise VaultError("The method parameter is only valid for version 1")
+        if self.kv_engine_version == 1 and cas:
+            raise VaultError("The cas parameter is only valid for version 2")
+        if self.kv_engine_version == 1:
+            response = self.client.secrets.kv.v1.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, method=method)
+        else:
+            response = self.client.secrets.kv.v2.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, cas=cas)
+        return response
+
+
+class VaultHook(BaseHook):
+    """
+    Hook to Interact with HashiCorp Vault KeyValue Secret engine.
+
+    HashiCorp hvac documentation:
+       * https://hvac.readthedocs.io/en/stable/
+
+    You connect to the host specified as host in the connection. The login/password from the connection
+    are used as credentials usually and you can specify different authentication parameters
+    via init params or via corresponding extras in the connection.
+
+    The extras in the connection are named the same as the parameters (`mount_point`,'kv_engine_version' ...).
+
+    Login/Password are used as credentials:
+
+        * approle: password -> secret_id
+        * ldap: login -> username,   password -> password
+        * userpass: login -> username, password -> password
+
+    :param vault_conn_id: The id of the connection to use
+    :type vault_conn_id: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this ``mount_point`` is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Select the version of the engine to run (``1`` or ``2``). Defaults to
+          version defined in connection or ``2`` if not defined in connection.
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault.
+        (for ``token`` and ``github`` auth_type)
+    :type token: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type)
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type)
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``)
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type)
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type)
+    :type gcp_scopes: str
+
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        vault_conn_id: str,
+        auth_type: Optional[str] = None,
+        mount_point: Optional[str] = None,
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = None,
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+    ):
+        super().__init__()
+        self.connection = self.get_connection(vault_conn_id)
+
+        if not auth_type:
+            auth_type = self.connection.extra_dejson.get('auth_type') or "token"
+
+        if not mount_point:
+            mount_point = self.connection.extra_dejson.get('mount_point')
+        if not mount_point:
+            mount_point = 'secret'

Review comment:
       BTW. Will vault:// be usable in 1.10.* ? I will have to check it since I have to add it "Connection.py". I planned to use "http:"// originally (and I think it will work even in 1.10 anyway if I add it as http://) but I am wondering if I will be able to add vault:// as connection regardless. I will try it - it makes a different for backport documentation.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #9333: Add HashiCorp Vault Hook (split-out from Vault secret backend)

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441444875



##########
File path: airflow/providers/hashicorp/secrets/vault.py
##########
@@ -50,42 +46,44 @@ class VaultBackend(BaseSecretsBackend, LoggingMixin):
     would be accessible if you provide ``{"connections_path": "connections"}`` and request
     conn_id ``smtp_default``.
 
-    :param connections_path: Specifies the path of the secret to read to get Connections.
-        (default: 'connections')
+    :param connections_path: Specifies the path of the secret to read to get Connections
+        (default: 'connections').
     :type connections_path: str
-    :param variables_path: Specifies the path of the secret to read to get Variables.
-        (default: 'variables')
+    :param variables_path: Specifies the path of the secret to read to get Variables
+        (default: 'variables').
     :type variables_path: str
     :param url: Base URL for the Vault instance being addressed.
     :type url: str
-    :param auth_type: Authentication Type for Vault (one of 'token', 'ldap', 'userpass', 'approle',
-        'github', 'gcp', 'kubernetes'). Default is ``token``.
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.

Review comment:
       Thinking about this enum may be difficult to set via the `backend_kwargs` from the config file.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #9333: Add HashiCorp Vault Hook (split-out from Vault secret backend)

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441463524



##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,548 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook and the SecretBackend, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        ('approle', 'github', 'gcp', 'kubernetes', 'ldap', 'token', 'userpass')
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__()
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the KV engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_metadata(
+                path=secret_path,
+                mount_point=self.mount_point)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+        :rtype: dict
+        :return: The key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_version(
+                path=secret_path, mount_point=self.mount_point,
+                version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s and version %s",
+                           secret_path, self.mount_point, secret_version)
+            return None
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 2 and method:
+            raise VaultError("The method parameter is only valid for version 1")
+        if self.kv_engine_version == 1 and cas:
+            raise VaultError("The cas parameter is only valid for version 2")
+        if self.kv_engine_version == 1:
+            response = self.client.secrets.kv.v1.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, method=method)
+        else:
+            response = self.client.secrets.kv.v2.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, cas=cas)
+        return response
+
+
+class VaultHook(BaseHook):
+    """
+    Hook to Interact with HashiCorp Vault KeyValue Secret engine.
+
+    HashiCorp hvac documentation:
+       * https://hvac.readthedocs.io/en/stable/
+
+    You connect to the host specified as host in the connection. The login/password from the connection
+    are used as credentials usually and you can specify different authentication parameters
+    via init params or via corresponding extras in the connection.
+
+    The extras in the connection are named the same as the parameters (`mount_point`,'kv_engine_version' ...).
+
+    Login/Password are used as credentials:
+
+        * approle: password -> secret_id
+        * ldap: login -> username,   password -> password
+        * userpass: login -> username, password -> password
+
+    :param vault_conn_id: The id of the connection to use
+    :type vault_conn_id: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.

Review comment:
       This needs updating too




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #9333: Add HashiCorp Vault Hook (split-out from Vault secret backend)

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441456817



##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,546 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook an Secret, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__(**kwargs)
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_metadata(
+                path=secret_path,
+                mount_point=self.mount_point)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+        :rtype: dict
+        :return: The key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_version(
+                path=secret_path, mount_point=self.mount_point,
+                version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s and version %s",
+                           secret_path, self.mount_point, secret_version)
+            return None
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 2 and method:
+            raise VaultError("The method parameter is only valid for version 1")
+        if self.kv_engine_version == 1 and cas:
+            raise VaultError("The cas parameter is only valid for version 2")
+        if self.kv_engine_version == 1:
+            response = self.client.secrets.kv.v1.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, method=method)
+        else:
+            response = self.client.secrets.kv.v2.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, cas=cas)
+        return response
+
+
+class VaultHook(BaseHook):
+    """
+    HashiCorp Vault wrapper to Interact with HashiCorp Vault KeyValue Secret engine.
+
+    HashiCorp HVac documentation:
+       * https://hvac.readthedocs.io/en/stable/
+
+    You connect to the host specified as host in the connection. The login/password from the connection
+    are used as credentials usually and you can specify different authentication parameters
+    via init params or via corresponding extras in the connection.
+
+    The extras in the connection are named the same as the parameters (`mount_point`,'kv_engine_version' ...).
+
+    Login/Password are used as credentials:
+
+        * approle: password -> secret_id
+        * ldap: login -> username,   password -> password
+        * userpass: login -> username, password -> password
+
+    :param vault_conn_id: The id of the connection to use
+    :type vault_conn_id: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Select the version of the engine to run (``1`` or ``2``). Defaults to
+          version defined in connection or ``2`` if not defined in connection.
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault.
+        (for ``token`` and ``github`` auth_type)
+    :type token: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type)
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type)
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``)
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type)
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type)
+    :type gcp_scopes: str
+
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        vault_conn_id: str,
+        auth_type: Optional[str] = None,
+        mount_point: Optional[str] = None,
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = None,
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+    ):
+        super().__init__()
+        self.connection = self.get_connection(vault_conn_id)
+
+        if not auth_type:
+            auth_type = self.connection.extra_dejson.get('auth_type') or "token"
+
+        if not mount_point:
+            mount_point = self.connection.extra_dejson.get('mount_point')
+        if not mount_point:
+            mount_point = 'secret'
+
+        if not kv_engine_version:
+            conn_version = self.connection.extra_dejson.get("kv_engine_version")
+            try:
+                kv_engine_version = int(conn_version) if conn_version else DEFAULT_KV_ENGINE_VERSION
+            except ValueError:
+                raise VaultError(f"The version is not an int: {conn_version}. ")
+
+        if auth_type in ["approle"]:
+            if not role_id:
+                role_id = self.connection.extra_dejson.get('role_id')
+
+        if auth_type in ['github', "token"]:
+            if not token:
+                token = self.connection.extra_dejson.get('token')
+        gcp_key_path, gcp_scopes = \
+            self._get_gcp_parameters_from_connection(gcp_key_path, gcp_scopes) \
+            if auth_type == 'gcp' else (None, None)
+        kubernetes_jwt_path, kubernetes_role = \
+            self._get_kubernetes_parameters_from_connection(kubernetes_jwt_path, kubernetes_role) \
+            if auth_type == 'kubernetes' else (None, None)
+
+        url = f"{self.connection.schema}://{self.connection.host}"
+        if self.connection.port:
+            url += f":{self.connection.port}"
+
+        self.vault_client = _VaultClient(
+            url=url,
+            auth_type=auth_type,
+            mount_point=mount_point,
+            kv_engine_version=kv_engine_version,
+            token=token,
+            username=self.connection.login,
+            password=self.connection.password,
+            secret_id=self.connection.password,

Review comment:
       Ah gotcha. :+1: 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #9333: Add HashiCorp Vault Hook (split-out from Vault secret backend)

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441485709



##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,548 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook and the SecretBackend, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        ('approle', 'github', 'gcp', 'kubernetes', 'ldap', 'token', 'userpass')
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__()
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the KV engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_metadata(
+                path=secret_path,
+                mount_point=self.mount_point)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+        :rtype: dict
+        :return: The key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_version(
+                path=secret_path, mount_point=self.mount_point,
+                version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s and version %s",
+                           secret_path, self.mount_point, secret_version)
+            return None
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 2 and method:
+            raise VaultError("The method parameter is only valid for version 1")
+        if self.kv_engine_version == 1 and cas:
+            raise VaultError("The cas parameter is only valid for version 2")
+        if self.kv_engine_version == 1:
+            response = self.client.secrets.kv.v1.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, method=method)
+        else:
+            response = self.client.secrets.kv.v2.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, cas=cas)
+        return response
+
+
+class VaultHook(BaseHook):
+    """
+    Hook to Interact with HashiCorp Vault KeyValue Secret engine.
+
+    HashiCorp hvac documentation:
+       * https://hvac.readthedocs.io/en/stable/
+
+    You connect to the host specified as host in the connection. The login/password from the connection
+    are used as credentials usually and you can specify different authentication parameters
+    via init params or via corresponding extras in the connection.
+
+    The extras in the connection are named the same as the parameters (`mount_point`,'kv_engine_version' ...).
+
+    Login/Password are used as credentials:
+
+        * approle: password -> secret_id
+        * ldap: login -> username,   password -> password
+        * userpass: login -> username, password -> password
+
+    :param vault_conn_id: The id of the connection to use
+    :type vault_conn_id: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this ``mount_point`` is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Select the version of the engine to run (``1`` or ``2``). Defaults to
+          version defined in connection or ``2`` if not defined in connection.
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault.
+        (for ``token`` and ``github`` auth_type)
+    :type token: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type)
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type)
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``)
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type)
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type)
+    :type gcp_scopes: str
+
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        vault_conn_id: str,
+        auth_type: Optional[str] = None,
+        mount_point: Optional[str] = None,
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = None,
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,

Review comment:
       Indeed. Good catch.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #9333: Add HashiCorp Vault Hook (split-out from Vault secret backend)

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#issuecomment-645427140


   All green - added the , and waiting for CI


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #9333: Split-off vault hook from vault secret backend

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441436656



##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,546 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook an Secret, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__(**kwargs)
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_metadata(
+                path=secret_path,
+                mount_point=self.mount_point)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+        :rtype: dict
+        :return: The key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_version(
+                path=secret_path, mount_point=self.mount_point,
+                version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s and version %s",
+                           secret_path, self.mount_point, secret_version)
+            return None
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 2 and method:
+            raise VaultError("The method parameter is only valid for version 1")
+        if self.kv_engine_version == 1 and cas:
+            raise VaultError("The cas parameter is only valid for version 2")
+        if self.kv_engine_version == 1:
+            response = self.client.secrets.kv.v1.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, method=method)
+        else:
+            response = self.client.secrets.kv.v2.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, cas=cas)
+        return response
+
+
+class VaultHook(BaseHook):
+    """
+    HashiCorp Vault wrapper to Interact with HashiCorp Vault KeyValue Secret engine.
+
+    HashiCorp HVac documentation:
+       * https://hvac.readthedocs.io/en/stable/
+
+    You connect to the host specified as host in the connection. The login/password from the connection
+    are used as credentials usually and you can specify different authentication parameters
+    via init params or via corresponding extras in the connection.
+
+    The extras in the connection are named the same as the parameters (`mount_point`,'kv_engine_version' ...).
+
+    Login/Password are used as credentials:
+
+        * approle: password -> secret_id
+        * ldap: login -> username,   password -> password
+        * userpass: login -> username, password -> password
+
+    :param vault_conn_id: The id of the connection to use
+    :type vault_conn_id: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Select the version of the engine to run (``1`` or ``2``). Defaults to
+          version defined in connection or ``2`` if not defined in connection.
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault.
+        (for ``token`` and ``github`` auth_type)
+    :type token: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type)
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type)
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``)
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type)
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type)
+    :type gcp_scopes: str
+
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        vault_conn_id: str,
+        auth_type: Optional[str] = None,
+        mount_point: Optional[str] = None,
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = None,
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+    ):
+        super().__init__()
+        self.connection = self.get_connection(vault_conn_id)
+
+        if not auth_type:
+            auth_type = self.connection.extra_dejson.get('auth_type') or "token"
+
+        if not mount_point:
+            mount_point = self.connection.extra_dejson.get('mount_point')
+        if not mount_point:
+            mount_point = 'secret'
+
+        if not kv_engine_version:
+            conn_version = self.connection.extra_dejson.get("kv_engine_version")
+            try:
+                kv_engine_version = int(conn_version) if conn_version else DEFAULT_KV_ENGINE_VERSION
+            except ValueError:
+                raise VaultError(f"The version is not an int: {conn_version}. ")
+
+        if auth_type in ["approle"]:
+            if not role_id:
+                role_id = self.connection.extra_dejson.get('role_id')
+
+        if auth_type in ['github', "token"]:
+            if not token:
+                token = self.connection.extra_dejson.get('token')
+        gcp_key_path, gcp_scopes = \
+            self._get_gcp_parameters_from_connection(gcp_key_path, gcp_scopes) \
+            if auth_type == 'gcp' else (None, None)
+        kubernetes_jwt_path, kubernetes_role = \
+            self._get_kubernetes_parameters_from_connection(kubernetes_jwt_path, kubernetes_role) \
+            if auth_type == 'kubernetes' else (None, None)
+
+        url = f"{self.connection.schema}://{self.connection.host}"
+        if self.connection.port:
+            url += f":{self.connection.port}"
+
+        self.vault_client = _VaultClient(
+            url=url,
+            auth_type=auth_type,
+            mount_point=mount_point,
+            kv_engine_version=kv_engine_version,
+            token=token,
+            username=self.connection.login,
+            password=self.connection.password,
+            secret_id=self.connection.password,

Review comment:
       In the hook I mapped secret_id from the password (this is only used in approle method). it makes more sense to use password from the connection - mostly because it is replaced with *********  in the UI, so storing that in extras as "secret_id" is not a good idea I think.
   
   This is explained in the Hook's docstring
   ```
       Login/Password are used as credentials:
   
           * approle: password -> secret_id
           * ldap: login -> username,   password -> password
           * userpass: login -> username, password -> password
   
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #9333: Add HashiCorp Vault Hook (split-out from Vault secret backend)

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441456482



##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,546 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook an Secret, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__(**kwargs)
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_metadata(
+                path=secret_path,
+                mount_point=self.mount_point)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+        :rtype: dict
+        :return: The key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_version(
+                path=secret_path, mount_point=self.mount_point,
+                version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s and version %s",
+                           secret_path, self.mount_point, secret_version)
+            return None
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 2 and method:
+            raise VaultError("The method parameter is only valid for version 1")
+        if self.kv_engine_version == 1 and cas:
+            raise VaultError("The cas parameter is only valid for version 2")
+        if self.kv_engine_version == 1:
+            response = self.client.secrets.kv.v1.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, method=method)
+        else:
+            response = self.client.secrets.kv.v2.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, cas=cas)
+        return response
+
+
+class VaultHook(BaseHook):
+    """
+    HashiCorp Vault wrapper to Interact with HashiCorp Vault KeyValue Secret engine.
+
+    HashiCorp HVac documentation:
+       * https://hvac.readthedocs.io/en/stable/
+
+    You connect to the host specified as host in the connection. The login/password from the connection
+    are used as credentials usually and you can specify different authentication parameters
+    via init params or via corresponding extras in the connection.
+
+    The extras in the connection are named the same as the parameters (`mount_point`,'kv_engine_version' ...).
+
+    Login/Password are used as credentials:
+
+        * approle: password -> secret_id
+        * ldap: login -> username,   password -> password
+        * userpass: login -> username, password -> password
+
+    :param vault_conn_id: The id of the connection to use
+    :type vault_conn_id: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Select the version of the engine to run (``1`` or ``2``). Defaults to
+          version defined in connection or ``2`` if not defined in connection.
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault.
+        (for ``token`` and ``github`` auth_type)
+    :type token: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type)
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type)
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``)
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type)
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type)
+    :type gcp_scopes: str
+
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        vault_conn_id: str,
+        auth_type: Optional[str] = None,
+        mount_point: Optional[str] = None,
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = None,
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+    ):
+        super().__init__()
+        self.connection = self.get_connection(vault_conn_id)
+
+        if not auth_type:
+            auth_type = self.connection.extra_dejson.get('auth_type') or "token"
+
+        if not mount_point:
+            mount_point = self.connection.extra_dejson.get('mount_point')
+        if not mount_point:
+            mount_point = 'secret'
+
+        if not kv_engine_version:
+            conn_version = self.connection.extra_dejson.get("kv_engine_version")
+            try:
+                kv_engine_version = int(conn_version) if conn_version else DEFAULT_KV_ENGINE_VERSION
+            except ValueError:
+                raise VaultError(f"The version is not an int: {conn_version}. ")
+
+        if auth_type in ["approle"]:
+            if not role_id:
+                role_id = self.connection.extra_dejson.get('role_id')
+
+        if auth_type in ['github', "token"]:
+            if not token:
+                token = self.connection.extra_dejson.get('token')
+        gcp_key_path, gcp_scopes = \
+            self._get_gcp_parameters_from_connection(gcp_key_path, gcp_scopes) \
+            if auth_type == 'gcp' else (None, None)
+        kubernetes_jwt_path, kubernetes_role = \
+            self._get_kubernetes_parameters_from_connection(kubernetes_jwt_path, kubernetes_role) \
+            if auth_type == 'kubernetes' else (None, None)
+
+        url = f"{self.connection.schema}://{self.connection.host}"
+        if self.connection.port:
+            url += f":{self.connection.port}"
+
+        self.vault_client = _VaultClient(
+            url=url,
+            auth_type=auth_type,
+            mount_point=mount_point,
+            kv_engine_version=kv_engine_version,
+            token=token,
+            username=self.connection.login,
+            password=self.connection.password,
+            secret_id=self.connection.password,
+            role_id=role_id,
+            kubernetes_role=kubernetes_role,
+            kubernetes_jwt_path=kubernetes_jwt_path,
+            gcp_key_path=gcp_key_path,
+            gcp_scopes=gcp_scopes,
+        )
+
+    def _get_kubernetes_parameters_from_connection(
+        self, kubernetes_jwt_path: Optional[str], kubernetes_role: Optional[str]) \
+            -> Tuple[str, Optional[str]]:
+        if not kubernetes_jwt_path:
+            kubernetes_jwt_path = self.connection.extra_dejson.get("kubernetes_jwt_path")
+            if not kubernetes_jwt_path:
+                kubernetes_jwt_path = DEFAULT_KUBERNETES_JWT_PATH
+        if not kubernetes_role:
+            kubernetes_role = self.connection.extra_dejson.get("kubernetes_role")
+        return kubernetes_jwt_path, kubernetes_role
+
+    def _get_gcp_parameters_from_connection(
+        self, gcp_key_path: Optional[str], gcp_scopes: Optional[str]) \
+            -> Tuple[Optional[str], Optional[str]]:

Review comment:
       Yeah, agreed! (That's going to be a _monster_ commit. I suggest we do it _right_ before we release 2.0, so that the release-2.0 branch has the changes before we create it)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #9333: Add HashiCorp Vault Hook (split-out from Vault secret backend)

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441505662



##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,548 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes

Review comment:
       No. I exclude it by moving to separated protected module. I think it's easiest.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #9333: Add HashiCorp Vault Hook (split-out from Vault secret backend)

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#issuecomment-645298092


   All issues solved. Docs look good. I am rebasing #8974 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #9333: Add HashiCorp Vault Hook (split-out from Vault secret backend)

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441461533



##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,546 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook an Secret, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__(**kwargs)
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_metadata(
+                path=secret_path,
+                mount_point=self.mount_point)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+        :rtype: dict
+        :return: The key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_version(
+                path=secret_path, mount_point=self.mount_point,
+                version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s and version %s",
+                           secret_path, self.mount_point, secret_version)
+            return None
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 2 and method:
+            raise VaultError("The method parameter is only valid for version 1")
+        if self.kv_engine_version == 1 and cas:
+            raise VaultError("The cas parameter is only valid for version 2")
+        if self.kv_engine_version == 1:
+            response = self.client.secrets.kv.v1.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, method=method)
+        else:
+            response = self.client.secrets.kv.v2.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, cas=cas)
+        return response
+
+
+class VaultHook(BaseHook):
+    """
+    HashiCorp Vault wrapper to Interact with HashiCorp Vault KeyValue Secret engine.
+
+    HashiCorp HVac documentation:
+       * https://hvac.readthedocs.io/en/stable/
+
+    You connect to the host specified as host in the connection. The login/password from the connection
+    are used as credentials usually and you can specify different authentication parameters
+    via init params or via corresponding extras in the connection.
+
+    The extras in the connection are named the same as the parameters (`mount_point`,'kv_engine_version' ...).
+
+    Login/Password are used as credentials:
+
+        * approle: password -> secret_id
+        * ldap: login -> username,   password -> password
+        * userpass: login -> username, password -> password
+
+    :param vault_conn_id: The id of the connection to use
+    :type vault_conn_id: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Select the version of the engine to run (``1`` or ``2``). Defaults to
+          version defined in connection or ``2`` if not defined in connection.
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault.
+        (for ``token`` and ``github`` auth_type)
+    :type token: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type)
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type)
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``)
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type)
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type)
+    :type gcp_scopes: str
+
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        vault_conn_id: str,
+        auth_type: Optional[str] = None,
+        mount_point: Optional[str] = None,
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = None,
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+    ):
+        super().__init__()
+        self.connection = self.get_connection(vault_conn_id)
+
+        if not auth_type:
+            auth_type = self.connection.extra_dejson.get('auth_type') or "token"
+
+        if not mount_point:
+            mount_point = self.connection.extra_dejson.get('mount_point')
+        if not mount_point:
+            mount_point = 'secret'
+
+        if not kv_engine_version:
+            conn_version = self.connection.extra_dejson.get("kv_engine_version")
+            try:
+                kv_engine_version = int(conn_version) if conn_version else DEFAULT_KV_ENGINE_VERSION
+            except ValueError:
+                raise VaultError(f"The version is not an int: {conn_version}. ")
+
+        if auth_type in ["approle"]:
+            if not role_id:
+                role_id = self.connection.extra_dejson.get('role_id')
+
+        if auth_type in ['github', "token"]:
+            if not token:
+                token = self.connection.extra_dejson.get('token')
+        gcp_key_path, gcp_scopes = \
+            self._get_gcp_parameters_from_connection(gcp_key_path, gcp_scopes) \

Review comment:
       Do we need/want them to be in extra and constructor args? (i.e. what workload are we supporting on the hook where we want to pull _some_ fields from a connection but others from constructor args?)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #9333: Add HashiCorp Vault Hook (split-out from Vault secret backend)

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441462759



##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,548 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook and the SecretBackend, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in

Review comment:
       ```suggestion
       :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #9333: Add HashiCorp Vault Hook (split-out from Vault secret backend)

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441483312



##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,546 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook an Secret, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__(**kwargs)
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_metadata(
+                path=secret_path,
+                mount_point=self.mount_point)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+        :rtype: dict
+        :return: The key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_version(
+                path=secret_path, mount_point=self.mount_point,
+                version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s and version %s",
+                           secret_path, self.mount_point, secret_version)
+            return None
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 2 and method:
+            raise VaultError("The method parameter is only valid for version 1")
+        if self.kv_engine_version == 1 and cas:
+            raise VaultError("The cas parameter is only valid for version 2")
+        if self.kv_engine_version == 1:
+            response = self.client.secrets.kv.v1.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, method=method)
+        else:
+            response = self.client.secrets.kv.v2.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, cas=cas)
+        return response
+
+
+class VaultHook(BaseHook):
+    """
+    HashiCorp Vault wrapper to Interact with HashiCorp Vault KeyValue Secret engine.
+
+    HashiCorp HVac documentation:
+       * https://hvac.readthedocs.io/en/stable/
+
+    You connect to the host specified as host in the connection. The login/password from the connection
+    are used as credentials usually and you can specify different authentication parameters
+    via init params or via corresponding extras in the connection.
+
+    The extras in the connection are named the same as the parameters (`mount_point`,'kv_engine_version' ...).
+
+    Login/Password are used as credentials:
+
+        * approle: password -> secret_id
+        * ldap: login -> username,   password -> password
+        * userpass: login -> username, password -> password
+
+    :param vault_conn_id: The id of the connection to use
+    :type vault_conn_id: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Select the version of the engine to run (``1`` or ``2``). Defaults to
+          version defined in connection or ``2`` if not defined in connection.
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault.
+        (for ``token`` and ``github`` auth_type)
+    :type token: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type)
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type)
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``)
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type)
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type)
+    :type gcp_scopes: str
+
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        vault_conn_id: str,
+        auth_type: Optional[str] = None,
+        mount_point: Optional[str] = None,
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = None,
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+    ):
+        super().__init__()
+        self.connection = self.get_connection(vault_conn_id)
+
+        if not auth_type:
+            auth_type = self.connection.extra_dejson.get('auth_type') or "token"
+
+        if not mount_point:
+            mount_point = self.connection.extra_dejson.get('mount_point')
+        if not mount_point:
+            mount_point = 'secret'
+
+        if not kv_engine_version:
+            conn_version = self.connection.extra_dejson.get("kv_engine_version")
+            try:
+                kv_engine_version = int(conn_version) if conn_version else DEFAULT_KV_ENGINE_VERSION
+            except ValueError:
+                raise VaultError(f"The version is not an int: {conn_version}. ")
+
+        if auth_type in ["approle"]:
+            if not role_id:
+                role_id = self.connection.extra_dejson.get('role_id')
+
+        if auth_type in ['github', "token"]:
+            if not token:
+                token = self.connection.extra_dejson.get('token')
+        gcp_key_path, gcp_scopes = \
+            self._get_gcp_parameters_from_connection(gcp_key_path, gcp_scopes) \

Review comment:
       We do not need to. But if found it useful to be able to override some of the parameters - especially that this gives indication of what extras you can provide in the vault connection:
   
   ```
       The extras in the connection are named the same as the parameters (`mount_point`,'kv_engine_version' ...).
   ```
   
   Since this is not basic authentication information (host/user/password/port which are part of the connection) but they are put in the extras. My assumption was that any "extra" can be also added as initialisation parameter. I looked at other hooks and we do it differently in different hooks, but often when there are "extras", you can set those extras  in the hook init param (also because setting extras via connection is not straightforward if you do not add your own connection type (and then it's not backportable). So this is mostly convenience also useful when you start playing with the hook and would like to try it before having to put them in connection extras.
   
   But I have no strong opinion here. 
   
   I am fine to remove those if you think it's not good. Particularly in the next PR i will definitely remove keyfile_dict (as it contains secret information that should not be put as parameter). But gcp_key_path and gcp_scopes fall IMHO into  the same category as all other extras (kubernetes_role, kv_engine_version etc.) so we either keep them all (mostly for  convenience) or remove them all. 
   
   I am fine with both -  if you have strong opinion I am happy to follow it.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #9333: Split-off vault hook from vault secret backend

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441440378



##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,546 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook an Secret, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__(**kwargs)

Review comment:
       Good catch!




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #9333: Split-off vault hook from vault secret backend

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#issuecomment-644856469


   Hey @ashb @kaxil - as discussed in #8974 I split that into two: this one is introducing Vault Hook. The second one will add new authentication methods.
   
   The _VaultClient is now clearly marked as an internal class (both by underscored name and comment in docstring. I think it makes more sense that introducing new package.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #9333: Split-off vault hook from vault secret backend

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441438857



##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,546 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook an Secret, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__(**kwargs)
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_metadata(
+                path=secret_path,
+                mount_point=self.mount_point)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+        :rtype: dict
+        :return: The key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_version(
+                path=secret_path, mount_point=self.mount_point,
+                version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s and version %s",
+                           secret_path, self.mount_point, secret_version)
+            return None
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 2 and method:
+            raise VaultError("The method parameter is only valid for version 1")
+        if self.kv_engine_version == 1 and cas:
+            raise VaultError("The cas parameter is only valid for version 2")
+        if self.kv_engine_version == 1:
+            response = self.client.secrets.kv.v1.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, method=method)
+        else:
+            response = self.client.secrets.kv.v2.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, cas=cas)
+        return response
+
+
+class VaultHook(BaseHook):
+    """
+    HashiCorp Vault wrapper to Interact with HashiCorp Vault KeyValue Secret engine.
+
+    HashiCorp HVac documentation:
+       * https://hvac.readthedocs.io/en/stable/
+
+    You connect to the host specified as host in the connection. The login/password from the connection
+    are used as credentials usually and you can specify different authentication parameters
+    via init params or via corresponding extras in the connection.
+
+    The extras in the connection are named the same as the parameters (`mount_point`,'kv_engine_version' ...).
+
+    Login/Password are used as credentials:
+
+        * approle: password -> secret_id
+        * ldap: login -> username,   password -> password
+        * userpass: login -> username, password -> password
+
+    :param vault_conn_id: The id of the connection to use
+    :type vault_conn_id: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Select the version of the engine to run (``1`` or ``2``). Defaults to
+          version defined in connection or ``2`` if not defined in connection.
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault.
+        (for ``token`` and ``github`` auth_type)
+    :type token: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type)
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type)
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``)
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type)
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type)
+    :type gcp_scopes: str
+
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        vault_conn_id: str,
+        auth_type: Optional[str] = None,
+        mount_point: Optional[str] = None,
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = None,
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+    ):
+        super().__init__()
+        self.connection = self.get_connection(vault_conn_id)
+
+        if not auth_type:
+            auth_type = self.connection.extra_dejson.get('auth_type') or "token"
+
+        if not mount_point:
+            mount_point = self.connection.extra_dejson.get('mount_point')
+        if not mount_point:
+            mount_point = 'secret'
+
+        if not kv_engine_version:
+            conn_version = self.connection.extra_dejson.get("kv_engine_version")
+            try:
+                kv_engine_version = int(conn_version) if conn_version else DEFAULT_KV_ENGINE_VERSION
+            except ValueError:
+                raise VaultError(f"The version is not an int: {conn_version}. ")
+
+        if auth_type in ["approle"]:
+            if not role_id:
+                role_id = self.connection.extra_dejson.get('role_id')
+
+        if auth_type in ['github', "token"]:
+            if not token:
+                token = self.connection.extra_dejson.get('token')
+        gcp_key_path, gcp_scopes = \
+            self._get_gcp_parameters_from_connection(gcp_key_path, gcp_scopes) \
+            if auth_type == 'gcp' else (None, None)
+        kubernetes_jwt_path, kubernetes_role = \
+            self._get_kubernetes_parameters_from_connection(kubernetes_jwt_path, kubernetes_role) \
+            if auth_type == 'kubernetes' else (None, None)
+
+        url = f"{self.connection.schema}://{self.connection.host}"
+        if self.connection.port:
+            url += f":{self.connection.port}"
+
+        self.vault_client = _VaultClient(
+            url=url,
+            auth_type=auth_type,
+            mount_point=mount_point,
+            kv_engine_version=kv_engine_version,
+            token=token,
+            username=self.connection.login,
+            password=self.connection.password,
+            secret_id=self.connection.password,
+            role_id=role_id,
+            kubernetes_role=kubernetes_role,
+            kubernetes_jwt_path=kubernetes_jwt_path,
+            gcp_key_path=gcp_key_path,
+            gcp_scopes=gcp_scopes,
+        )
+
+    def _get_kubernetes_parameters_from_connection(
+        self, kubernetes_jwt_path: Optional[str], kubernetes_role: Optional[str]) \
+            -> Tuple[str, Optional[str]]:
+        if not kubernetes_jwt_path:
+            kubernetes_jwt_path = self.connection.extra_dejson.get("kubernetes_jwt_path")
+            if not kubernetes_jwt_path:
+                kubernetes_jwt_path = DEFAULT_KUBERNETES_JWT_PATH
+        if not kubernetes_role:
+            kubernetes_role = self.connection.extra_dejson.get("kubernetes_role")
+        return kubernetes_jwt_path, kubernetes_role
+
+    def _get_gcp_parameters_from_connection(
+        self, gcp_key_path: Optional[str], gcp_scopes: Optional[str]) \
+            -> Tuple[Optional[str], Optional[str]]:
+        if not gcp_scopes:
+            gcp_scopes = self.connection.extra_dejson.get("gcp_scopes")
+        if not gcp_key_path:
+            gcp_key_path = self.connection.extra_dejson.get("gcp_key_path")
+        return gcp_key_path, gcp_scopes
+
+    def get_conn(self) -> hvac.Client:
+        """
+        Retrieves connection to Vault.
+
+        :rtype: hvac.Client
+        :return: connection used.
+        """
+        return self.vault_client.client
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the engine.
+
+        :param secret_path: Path of the secret
+        :type secret_path: str
+        :param secret_version: Optional version of key to read - can only be used in case of version 2 of KV
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: Path of the secret
+        :type secret_path: str
+        :rtype: dict
+        :return: secret stored in the vault as a dictionary
+        """
+        return self.vault_client.get_secret(secret_path=secret_path, secret_version=secret_version)
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: Path to read from
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        return self.vault_client.get_secret_metadata(secret_path=secret_path)
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: Path of the secret
+        :type secret_path: str
+        :param secret_version: Optional version of key to read - can only be used in case of version 2 of KV
+        :type secret_version: int
+        :rtype: dict
+        :return: key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.

Review comment:
       Corrected and will double check.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #9333: Split-off vault hook from vault secret backend

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441440627



##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,546 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook an Secret, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__(**kwargs)
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.

Review comment:
       Nope




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #9333: Split-off vault hook from vault secret backend

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441438697



##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,546 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook an Secret, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__(**kwargs)
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_metadata(
+                path=secret_path,
+                mount_point=self.mount_point)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+        :rtype: dict
+        :return: The key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_version(
+                path=secret_path, mount_point=self.mount_point,
+                version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s and version %s",
+                           secret_path, self.mount_point, secret_version)
+            return None
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 2 and method:
+            raise VaultError("The method parameter is only valid for version 1")
+        if self.kv_engine_version == 1 and cas:
+            raise VaultError("The cas parameter is only valid for version 2")
+        if self.kv_engine_version == 1:
+            response = self.client.secrets.kv.v1.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, method=method)
+        else:
+            response = self.client.secrets.kv.v2.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, cas=cas)
+        return response
+
+
+class VaultHook(BaseHook):
+    """
+    HashiCorp Vault wrapper to Interact with HashiCorp Vault KeyValue Secret engine.
+
+    HashiCorp HVac documentation:
+       * https://hvac.readthedocs.io/en/stable/
+
+    You connect to the host specified as host in the connection. The login/password from the connection
+    are used as credentials usually and you can specify different authentication parameters
+    via init params or via corresponding extras in the connection.
+
+    The extras in the connection are named the same as the parameters (`mount_point`,'kv_engine_version' ...).
+
+    Login/Password are used as credentials:
+
+        * approle: password -> secret_id
+        * ldap: login -> username,   password -> password
+        * userpass: login -> username, password -> password
+
+    :param vault_conn_id: The id of the connection to use
+    :type vault_conn_id: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Select the version of the engine to run (``1`` or ``2``). Defaults to
+          version defined in connection or ``2`` if not defined in connection.
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault.
+        (for ``token`` and ``github`` auth_type)
+    :type token: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type)
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type)
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``)
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type)
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type)
+    :type gcp_scopes: str
+
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        vault_conn_id: str,
+        auth_type: Optional[str] = None,
+        mount_point: Optional[str] = None,
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = None,
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+    ):
+        super().__init__()
+        self.connection = self.get_connection(vault_conn_id)
+
+        if not auth_type:
+            auth_type = self.connection.extra_dejson.get('auth_type') or "token"
+
+        if not mount_point:
+            mount_point = self.connection.extra_dejson.get('mount_point')
+        if not mount_point:
+            mount_point = 'secret'
+
+        if not kv_engine_version:
+            conn_version = self.connection.extra_dejson.get("kv_engine_version")
+            try:
+                kv_engine_version = int(conn_version) if conn_version else DEFAULT_KV_ENGINE_VERSION
+            except ValueError:
+                raise VaultError(f"The version is not an int: {conn_version}. ")
+
+        if auth_type in ["approle"]:
+            if not role_id:
+                role_id = self.connection.extra_dejson.get('role_id')
+
+        if auth_type in ['github', "token"]:
+            if not token:
+                token = self.connection.extra_dejson.get('token')
+        gcp_key_path, gcp_scopes = \
+            self._get_gcp_parameters_from_connection(gcp_key_path, gcp_scopes) \
+            if auth_type == 'gcp' else (None, None)
+        kubernetes_jwt_path, kubernetes_role = \
+            self._get_kubernetes_parameters_from_connection(kubernetes_jwt_path, kubernetes_role) \
+            if auth_type == 'kubernetes' else (None, None)
+
+        url = f"{self.connection.schema}://{self.connection.host}"
+        if self.connection.port:
+            url += f":{self.connection.port}"
+
+        self.vault_client = _VaultClient(
+            url=url,
+            auth_type=auth_type,
+            mount_point=mount_point,
+            kv_engine_version=kv_engine_version,
+            token=token,
+            username=self.connection.login,
+            password=self.connection.password,
+            secret_id=self.connection.password,
+            role_id=role_id,
+            kubernetes_role=kubernetes_role,
+            kubernetes_jwt_path=kubernetes_jwt_path,
+            gcp_key_path=gcp_key_path,
+            gcp_scopes=gcp_scopes,
+        )
+
+    def _get_kubernetes_parameters_from_connection(
+        self, kubernetes_jwt_path: Optional[str], kubernetes_role: Optional[str]) \
+            -> Tuple[str, Optional[str]]:
+        if not kubernetes_jwt_path:
+            kubernetes_jwt_path = self.connection.extra_dejson.get("kubernetes_jwt_path")
+            if not kubernetes_jwt_path:
+                kubernetes_jwt_path = DEFAULT_KUBERNETES_JWT_PATH
+        if not kubernetes_role:
+            kubernetes_role = self.connection.extra_dejson.get("kubernetes_role")
+        return kubernetes_jwt_path, kubernetes_role
+
+    def _get_gcp_parameters_from_connection(
+        self, gcp_key_path: Optional[str], gcp_scopes: Optional[str]) \
+            -> Tuple[Optional[str], Optional[str]]:
+        if not gcp_scopes:
+            gcp_scopes = self.connection.extra_dejson.get("gcp_scopes")
+        if not gcp_key_path:
+            gcp_key_path = self.connection.extra_dejson.get("gcp_key_path")
+        return gcp_key_path, gcp_scopes
+
+    def get_conn(self) -> hvac.Client:
+        """
+        Retrieves connection to Vault.
+
+        :rtype: hvac.Client
+        :return: connection used.
+        """
+        return self.vault_client.client
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the engine.
+
+        :param secret_path: Path of the secret
+        :type secret_path: str
+        :param secret_version: Optional version of key to read - can only be used in case of version 2 of KV
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: Path of the secret
+        :type secret_path: str
+        :rtype: dict
+        :return: secret stored in the vault as a dictionary
+        """
+        return self.vault_client.get_secret(secret_path=secret_path, secret_version=secret_version)
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: Path to read from
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        return self.vault_client.get_secret_metadata(secret_path=secret_path)
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: Path of the secret
+        :type secret_path: str
+        :param secret_version: Optional version of key to read - can only be used in case of version 2 of KV
+        :type secret_version: int
+        :rtype: dict
+        :return: key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+
+        """
+        return self.vault_client.get_secret_including_metadata(
+            secret_path=secret_path, secret_version=secret_version)
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: Path to read from
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.

Review comment:
       Will check.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #9333: Split-off vault hook from vault secret backend

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441438243



##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,546 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook an Secret, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__(**kwargs)
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_metadata(
+                path=secret_path,
+                mount_point=self.mount_point)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+        :rtype: dict
+        :return: The key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_version(
+                path=secret_path, mount_point=self.mount_point,
+                version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s and version %s",
+                           secret_path, self.mount_point, secret_version)
+            return None
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 2 and method:
+            raise VaultError("The method parameter is only valid for version 1")
+        if self.kv_engine_version == 1 and cas:
+            raise VaultError("The cas parameter is only valid for version 2")
+        if self.kv_engine_version == 1:
+            response = self.client.secrets.kv.v1.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, method=method)
+        else:
+            response = self.client.secrets.kv.v2.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, cas=cas)
+        return response
+
+
+class VaultHook(BaseHook):
+    """
+    HashiCorp Vault wrapper to Interact with HashiCorp Vault KeyValue Secret engine.
+
+    HashiCorp HVac documentation:
+       * https://hvac.readthedocs.io/en/stable/
+
+    You connect to the host specified as host in the connection. The login/password from the connection
+    are used as credentials usually and you can specify different authentication parameters
+    via init params or via corresponding extras in the connection.
+
+    The extras in the connection are named the same as the parameters (`mount_point`,'kv_engine_version' ...).
+
+    Login/Password are used as credentials:
+
+        * approle: password -> secret_id
+        * ldap: login -> username,   password -> password
+        * userpass: login -> username, password -> password
+
+    :param vault_conn_id: The id of the connection to use
+    :type vault_conn_id: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Select the version of the engine to run (``1`` or ``2``). Defaults to
+          version defined in connection or ``2`` if not defined in connection.
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault.
+        (for ``token`` and ``github`` auth_type)
+    :type token: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type)
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type)
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``)
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type)
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type)
+    :type gcp_scopes: str
+
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        vault_conn_id: str,
+        auth_type: Optional[str] = None,
+        mount_point: Optional[str] = None,
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = None,
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+    ):
+        super().__init__()
+        self.connection = self.get_connection(vault_conn_id)
+
+        if not auth_type:
+            auth_type = self.connection.extra_dejson.get('auth_type') or "token"
+
+        if not mount_point:
+            mount_point = self.connection.extra_dejson.get('mount_point')
+        if not mount_point:
+            mount_point = 'secret'
+
+        if not kv_engine_version:
+            conn_version = self.connection.extra_dejson.get("kv_engine_version")
+            try:
+                kv_engine_version = int(conn_version) if conn_version else DEFAULT_KV_ENGINE_VERSION
+            except ValueError:
+                raise VaultError(f"The version is not an int: {conn_version}. ")
+
+        if auth_type in ["approle"]:
+            if not role_id:
+                role_id = self.connection.extra_dejson.get('role_id')
+
+        if auth_type in ['github', "token"]:
+            if not token:
+                token = self.connection.extra_dejson.get('token')
+        gcp_key_path, gcp_scopes = \
+            self._get_gcp_parameters_from_connection(gcp_key_path, gcp_scopes) \
+            if auth_type == 'gcp' else (None, None)
+        kubernetes_jwt_path, kubernetes_role = \
+            self._get_kubernetes_parameters_from_connection(kubernetes_jwt_path, kubernetes_role) \
+            if auth_type == 'kubernetes' else (None, None)
+
+        url = f"{self.connection.schema}://{self.connection.host}"
+        if self.connection.port:
+            url += f":{self.connection.port}"
+
+        self.vault_client = _VaultClient(
+            url=url,
+            auth_type=auth_type,
+            mount_point=mount_point,
+            kv_engine_version=kv_engine_version,
+            token=token,
+            username=self.connection.login,
+            password=self.connection.password,
+            secret_id=self.connection.password,
+            role_id=role_id,
+            kubernetes_role=kubernetes_role,
+            kubernetes_jwt_path=kubernetes_jwt_path,
+            gcp_key_path=gcp_key_path,
+            gcp_scopes=gcp_scopes,
+        )
+
+    def _get_kubernetes_parameters_from_connection(
+        self, kubernetes_jwt_path: Optional[str], kubernetes_role: Optional[str]) \
+            -> Tuple[str, Optional[str]]:
+        if not kubernetes_jwt_path:
+            kubernetes_jwt_path = self.connection.extra_dejson.get("kubernetes_jwt_path")
+            if not kubernetes_jwt_path:
+                kubernetes_jwt_path = DEFAULT_KUBERNETES_JWT_PATH
+        if not kubernetes_role:
+            kubernetes_role = self.connection.extra_dejson.get("kubernetes_role")
+        return kubernetes_jwt_path, kubernetes_role
+
+    def _get_gcp_parameters_from_connection(
+        self, gcp_key_path: Optional[str], gcp_scopes: Optional[str]) \
+            -> Tuple[Optional[str], Optional[str]]:
+        if not gcp_scopes:
+            gcp_scopes = self.connection.extra_dejson.get("gcp_scopes")
+        if not gcp_key_path:
+            gcp_key_path = self.connection.extra_dejson.get("gcp_key_path")
+        return gcp_key_path, gcp_scopes
+
+    def get_conn(self) -> hvac.Client:
+        """
+        Retrieves connection to Vault.
+
+        :rtype: hvac.Client
+        :return: connection used.
+        """
+        return self.vault_client.client
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the engine.
+
+        :param secret_path: Path of the secret
+        :type secret_path: str
+        :param secret_version: Optional version of key to read - can only be used in case of version 2 of KV
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: Path of the secret
+        :type secret_path: str
+        :rtype: dict
+        :return: secret stored in the vault as a dictionary
+        """
+        return self.vault_client.get_secret(secret_path=secret_path, secret_version=secret_version)
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: Path to read from
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.

Review comment:
       I fixed those and will take a look at the docs generated to confirm all is ok.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #9333: Add HashiCorp Vault Hook (split-out from Vault secret backend)

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441483312



##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,546 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook an Secret, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__(**kwargs)
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_metadata(
+                path=secret_path,
+                mount_point=self.mount_point)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+        :rtype: dict
+        :return: The key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_version(
+                path=secret_path, mount_point=self.mount_point,
+                version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s and version %s",
+                           secret_path, self.mount_point, secret_version)
+            return None
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 2 and method:
+            raise VaultError("The method parameter is only valid for version 1")
+        if self.kv_engine_version == 1 and cas:
+            raise VaultError("The cas parameter is only valid for version 2")
+        if self.kv_engine_version == 1:
+            response = self.client.secrets.kv.v1.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, method=method)
+        else:
+            response = self.client.secrets.kv.v2.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, cas=cas)
+        return response
+
+
+class VaultHook(BaseHook):
+    """
+    HashiCorp Vault wrapper to Interact with HashiCorp Vault KeyValue Secret engine.
+
+    HashiCorp HVac documentation:
+       * https://hvac.readthedocs.io/en/stable/
+
+    You connect to the host specified as host in the connection. The login/password from the connection
+    are used as credentials usually and you can specify different authentication parameters
+    via init params or via corresponding extras in the connection.
+
+    The extras in the connection are named the same as the parameters (`mount_point`,'kv_engine_version' ...).
+
+    Login/Password are used as credentials:
+
+        * approle: password -> secret_id
+        * ldap: login -> username,   password -> password
+        * userpass: login -> username, password -> password
+
+    :param vault_conn_id: The id of the connection to use
+    :type vault_conn_id: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Select the version of the engine to run (``1`` or ``2``). Defaults to
+          version defined in connection or ``2`` if not defined in connection.
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault.
+        (for ``token`` and ``github`` auth_type)
+    :type token: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type)
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type)
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``)
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type)
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type)
+    :type gcp_scopes: str
+
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        vault_conn_id: str,
+        auth_type: Optional[str] = None,
+        mount_point: Optional[str] = None,
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = None,
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+    ):
+        super().__init__()
+        self.connection = self.get_connection(vault_conn_id)
+
+        if not auth_type:
+            auth_type = self.connection.extra_dejson.get('auth_type') or "token"
+
+        if not mount_point:
+            mount_point = self.connection.extra_dejson.get('mount_point')
+        if not mount_point:
+            mount_point = 'secret'
+
+        if not kv_engine_version:
+            conn_version = self.connection.extra_dejson.get("kv_engine_version")
+            try:
+                kv_engine_version = int(conn_version) if conn_version else DEFAULT_KV_ENGINE_VERSION
+            except ValueError:
+                raise VaultError(f"The version is not an int: {conn_version}. ")
+
+        if auth_type in ["approle"]:
+            if not role_id:
+                role_id = self.connection.extra_dejson.get('role_id')
+
+        if auth_type in ['github', "token"]:
+            if not token:
+                token = self.connection.extra_dejson.get('token')
+        gcp_key_path, gcp_scopes = \
+            self._get_gcp_parameters_from_connection(gcp_key_path, gcp_scopes) \

Review comment:
       We do not need to. But if found it useful to be able to override some of the parameters - especially that this gives indication of what extras you can provide in the vault connection:
   
   ```
       The extras in the connection are named the same as the parameters (`mount_point`,'kv_engine_version' ...).
   ```
   
   Since this is not basic authentication information (host/user/password/port which are part of the connection) but they are put in the extras. My assumption was that any "extra" can be also added as initialisation parameter. I looked at other hooks and we do it differently in different hooks, but usually when there are "extras", you can set those extras  in the hook init param (also because setting extras via connection is not straightforward if you do not add your own connection type (and then it's not backportable). So this is mostly convenience also useful when you start playing with the hook and would like to try it before having to put them in connection extras.
   
   But I have no strong opinion here. 
   
   I am fine to remove those if you think it's not good. Particularly in the next PR i will definitely remove keyfile_dict (as it contains secret information that should not be put as parameter). But gcp_key_path and gcp_scopes fall IMHO into  the same category as all other extras (kubernetes_role, kv_engine_version etc.) so we either keep them all (mostly for  convenience) or remove them all. 
   
   I am fine with both -  if you have strong opinion I am happy to follow it.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #9333: Split-off vault hook from vault secret backend

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441415763



##########
File path: airflow/providers/hashicorp/secrets/vault.py
##########
@@ -50,42 +46,44 @@ class VaultBackend(BaseSecretsBackend, LoggingMixin):
     would be accessible if you provide ``{"connections_path": "connections"}`` and request
     conn_id ``smtp_default``.
 
-    :param connections_path: Specifies the path of the secret to read to get Connections.
-        (default: 'connections')
+    :param connections_path: Specifies the path of the secret to read to get Connections
+        (default: 'connections').
     :type connections_path: str
-    :param variables_path: Specifies the path of the secret to read to get Variables.
-        (default: 'variables')
+    :param variables_path: Specifies the path of the secret to read to get Variables
+        (default: 'variables').
     :type variables_path: str
     :param url: Base URL for the Vault instance being addressed.
     :type url: str
-    :param auth_type: Authentication Type for Vault (one of 'token', 'ldap', 'userpass', 'approle',
-        'github', 'gcp', 'kubernetes'). Default is ``token``.
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
     :type auth_type: str
-    :param mount_point: The "path" the secret engine was mounted on. (Default: ``secret``)
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
     :type mount_point: str
+    :param kv_engine_version: Select the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
     :param token: Authentication token to include in requests sent to Vault.
         (for ``token`` and ``github`` auth_type)
     :type token: str
-    :param kv_engine_version: Select the version of the engine to run (``1`` or ``2``, default: ``2``)
-    :type kv_engine_version: int
-    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_type)
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_type).
     :type username: str
-    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_type)
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_type).
     :type password: str
-    :param role_id: Role ID for Authentication (for ``approle`` auth_type)
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
     :type role_id: str
-    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type)
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
     :type kubernetes_role: str
-    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, deafult:
-        ``/var/run/secrets/kubernetes.io/serviceaccount/token``)
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
     :type kubernetes_jwt_path: str
-    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type)
-    :type secret_id: str
-    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type)
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
     :type gcp_key_path: str
-    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type)
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
     :type gcp_scopes: str
-    """
+   """

Review comment:
       This looks like an errant delete happened?
   
   ```suggestion
       """
   ```

##########
File path: airflow/providers/hashicorp/secrets/vault.py
##########
@@ -50,42 +46,44 @@ class VaultBackend(BaseSecretsBackend, LoggingMixin):
     would be accessible if you provide ``{"connections_path": "connections"}`` and request
     conn_id ``smtp_default``.
 
-    :param connections_path: Specifies the path of the secret to read to get Connections.
-        (default: 'connections')
+    :param connections_path: Specifies the path of the secret to read to get Connections
+        (default: 'connections').
     :type connections_path: str
-    :param variables_path: Specifies the path of the secret to read to get Variables.
-        (default: 'variables')
+    :param variables_path: Specifies the path of the secret to read to get Variables
+        (default: 'variables').
     :type variables_path: str
     :param url: Base URL for the Vault instance being addressed.
     :type url: str
-    :param auth_type: Authentication Type for Vault (one of 'token', 'ldap', 'userpass', 'approle',
-        'github', 'gcp', 'kubernetes'). Default is ``token``.
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.

Review comment:
       Do the possible values get shown in the rendered docs? If not can it please, otherwise we're needing users to source dive which isn't great.
   
   (Perhaps this means turning `VALID_AUTH_TYPES` in to `class AuthTypes(enum.enum)`? I'm not saying I need this to be an enum type, just I'd like the possible values in the rendered docs, and enum might be a way of achieving that.)

##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,546 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook an Secret, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__(**kwargs)
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_metadata(
+                path=secret_path,
+                mount_point=self.mount_point)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+        :rtype: dict
+        :return: The key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_version(
+                path=secret_path, mount_point=self.mount_point,
+                version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s and version %s",
+                           secret_path, self.mount_point, secret_version)
+            return None
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 2 and method:
+            raise VaultError("The method parameter is only valid for version 1")
+        if self.kv_engine_version == 1 and cas:
+            raise VaultError("The cas parameter is only valid for version 2")
+        if self.kv_engine_version == 1:
+            response = self.client.secrets.kv.v1.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, method=method)
+        else:
+            response = self.client.secrets.kv.v2.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, cas=cas)
+        return response
+
+
+class VaultHook(BaseHook):
+    """
+    HashiCorp Vault wrapper to Interact with HashiCorp Vault KeyValue Secret engine.
+
+    HashiCorp HVac documentation:
+       * https://hvac.readthedocs.io/en/stable/
+
+    You connect to the host specified as host in the connection. The login/password from the connection
+    are used as credentials usually and you can specify different authentication parameters
+    via init params or via corresponding extras in the connection.
+
+    The extras in the connection are named the same as the parameters (`mount_point`,'kv_engine_version' ...).
+
+    Login/Password are used as credentials:
+
+        * approle: password -> secret_id
+        * ldap: login -> username,   password -> password
+        * userpass: login -> username, password -> password
+
+    :param vault_conn_id: The id of the connection to use
+    :type vault_conn_id: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Select the version of the engine to run (``1`` or ``2``). Defaults to
+          version defined in connection or ``2`` if not defined in connection.
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault.
+        (for ``token`` and ``github`` auth_type)
+    :type token: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type)
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type)
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``)
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type)
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type)
+    :type gcp_scopes: str
+
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        vault_conn_id: str,
+        auth_type: Optional[str] = None,
+        mount_point: Optional[str] = None,
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = None,
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+    ):
+        super().__init__()
+        self.connection = self.get_connection(vault_conn_id)
+
+        if not auth_type:
+            auth_type = self.connection.extra_dejson.get('auth_type') or "token"
+
+        if not mount_point:
+            mount_point = self.connection.extra_dejson.get('mount_point')
+        if not mount_point:
+            mount_point = 'secret'
+
+        if not kv_engine_version:
+            conn_version = self.connection.extra_dejson.get("kv_engine_version")
+            try:
+                kv_engine_version = int(conn_version) if conn_version else DEFAULT_KV_ENGINE_VERSION
+            except ValueError:
+                raise VaultError(f"The version is not an int: {conn_version}. ")
+
+        if auth_type in ["approle"]:
+            if not role_id:
+                role_id = self.connection.extra_dejson.get('role_id')
+
+        if auth_type in ['github', "token"]:
+            if not token:
+                token = self.connection.extra_dejson.get('token')
+        gcp_key_path, gcp_scopes = \
+            self._get_gcp_parameters_from_connection(gcp_key_path, gcp_scopes) \
+            if auth_type == 'gcp' else (None, None)
+        kubernetes_jwt_path, kubernetes_role = \
+            self._get_kubernetes_parameters_from_connection(kubernetes_jwt_path, kubernetes_role) \
+            if auth_type == 'kubernetes' else (None, None)
+
+        url = f"{self.connection.schema}://{self.connection.host}"
+        if self.connection.port:
+            url += f":{self.connection.port}"
+
+        self.vault_client = _VaultClient(
+            url=url,
+            auth_type=auth_type,
+            mount_point=mount_point,
+            kv_engine_version=kv_engine_version,
+            token=token,
+            username=self.connection.login,
+            password=self.connection.password,
+            secret_id=self.connection.password,

Review comment:
       ```suggestion
               secret_id=self.secret_id,
   ```
   
   I think?

##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,546 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook an Secret, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__(**kwargs)
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_metadata(
+                path=secret_path,
+                mount_point=self.mount_point)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+        :rtype: dict
+        :return: The key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_version(
+                path=secret_path, mount_point=self.mount_point,
+                version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s and version %s",
+                           secret_path, self.mount_point, secret_version)
+            return None
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 2 and method:
+            raise VaultError("The method parameter is only valid for version 1")
+        if self.kv_engine_version == 1 and cas:
+            raise VaultError("The cas parameter is only valid for version 2")
+        if self.kv_engine_version == 1:
+            response = self.client.secrets.kv.v1.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, method=method)
+        else:
+            response = self.client.secrets.kv.v2.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, cas=cas)
+        return response
+
+
+class VaultHook(BaseHook):
+    """
+    HashiCorp Vault wrapper to Interact with HashiCorp Vault KeyValue Secret engine.

Review comment:
       ```suggestion
       Hook to Interact with HashiCorp Vault KeyValue Secret engine.
   ```

##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,546 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook an Secret, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__(**kwargs)
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_metadata(
+                path=secret_path,
+                mount_point=self.mount_point)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+        :rtype: dict
+        :return: The key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_version(
+                path=secret_path, mount_point=self.mount_point,
+                version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s and version %s",
+                           secret_path, self.mount_point, secret_version)
+            return None
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 2 and method:
+            raise VaultError("The method parameter is only valid for version 1")
+        if self.kv_engine_version == 1 and cas:
+            raise VaultError("The cas parameter is only valid for version 2")
+        if self.kv_engine_version == 1:
+            response = self.client.secrets.kv.v1.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, method=method)
+        else:
+            response = self.client.secrets.kv.v2.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, cas=cas)
+        return response
+
+
+class VaultHook(BaseHook):
+    """
+    HashiCorp Vault wrapper to Interact with HashiCorp Vault KeyValue Secret engine.
+
+    HashiCorp HVac documentation:
+       * https://hvac.readthedocs.io/en/stable/
+
+    You connect to the host specified as host in the connection. The login/password from the connection
+    are used as credentials usually and you can specify different authentication parameters
+    via init params or via corresponding extras in the connection.
+
+    The extras in the connection are named the same as the parameters (`mount_point`,'kv_engine_version' ...).
+
+    Login/Password are used as credentials:
+
+        * approle: password -> secret_id
+        * ldap: login -> username,   password -> password
+        * userpass: login -> username, password -> password
+
+    :param vault_conn_id: The id of the connection to use
+    :type vault_conn_id: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Select the version of the engine to run (``1`` or ``2``). Defaults to
+          version defined in connection or ``2`` if not defined in connection.
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault.
+        (for ``token`` and ``github`` auth_type)
+    :type token: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type)
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type)
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``)
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type)
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type)
+    :type gcp_scopes: str
+
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        vault_conn_id: str,
+        auth_type: Optional[str] = None,
+        mount_point: Optional[str] = None,
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = None,
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+    ):
+        super().__init__()
+        self.connection = self.get_connection(vault_conn_id)
+
+        if not auth_type:
+            auth_type = self.connection.extra_dejson.get('auth_type') or "token"
+
+        if not mount_point:
+            mount_point = self.connection.extra_dejson.get('mount_point')
+        if not mount_point:
+            mount_point = 'secret'
+
+        if not kv_engine_version:
+            conn_version = self.connection.extra_dejson.get("kv_engine_version")
+            try:
+                kv_engine_version = int(conn_version) if conn_version else DEFAULT_KV_ENGINE_VERSION
+            except ValueError:
+                raise VaultError(f"The version is not an int: {conn_version}. ")
+
+        if auth_type in ["approle"]:
+            if not role_id:
+                role_id = self.connection.extra_dejson.get('role_id')
+
+        if auth_type in ['github', "token"]:
+            if not token:
+                token = self.connection.extra_dejson.get('token')
+        gcp_key_path, gcp_scopes = \
+            self._get_gcp_parameters_from_connection(gcp_key_path, gcp_scopes) \
+            if auth_type == 'gcp' else (None, None)
+        kubernetes_jwt_path, kubernetes_role = \
+            self._get_kubernetes_parameters_from_connection(kubernetes_jwt_path, kubernetes_role) \
+            if auth_type == 'kubernetes' else (None, None)
+
+        url = f"{self.connection.schema}://{self.connection.host}"
+        if self.connection.port:
+            url += f":{self.connection.port}"
+
+        self.vault_client = _VaultClient(
+            url=url,
+            auth_type=auth_type,
+            mount_point=mount_point,
+            kv_engine_version=kv_engine_version,
+            token=token,
+            username=self.connection.login,
+            password=self.connection.password,
+            secret_id=self.connection.password,
+            role_id=role_id,
+            kubernetes_role=kubernetes_role,
+            kubernetes_jwt_path=kubernetes_jwt_path,
+            gcp_key_path=gcp_key_path,
+            gcp_scopes=gcp_scopes,
+        )
+
+    def _get_kubernetes_parameters_from_connection(
+        self, kubernetes_jwt_path: Optional[str], kubernetes_role: Optional[str]) \
+            -> Tuple[str, Optional[str]]:
+        if not kubernetes_jwt_path:
+            kubernetes_jwt_path = self.connection.extra_dejson.get("kubernetes_jwt_path")
+            if not kubernetes_jwt_path:
+                kubernetes_jwt_path = DEFAULT_KUBERNETES_JWT_PATH
+        if not kubernetes_role:
+            kubernetes_role = self.connection.extra_dejson.get("kubernetes_role")
+        return kubernetes_jwt_path, kubernetes_role
+
+    def _get_gcp_parameters_from_connection(
+        self, gcp_key_path: Optional[str], gcp_scopes: Optional[str]) \
+            -> Tuple[Optional[str], Optional[str]]:
+        if not gcp_scopes:
+            gcp_scopes = self.connection.extra_dejson.get("gcp_scopes")
+        if not gcp_key_path:
+            gcp_key_path = self.connection.extra_dejson.get("gcp_key_path")
+        return gcp_key_path, gcp_scopes
+
+    def get_conn(self) -> hvac.Client:
+        """
+        Retrieves connection to Vault.
+
+        :rtype: hvac.Client
+        :return: connection used.
+        """
+        return self.vault_client.client
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the engine.
+
+        :param secret_path: Path of the secret
+        :type secret_path: str
+        :param secret_version: Optional version of key to read - can only be used in case of version 2 of KV
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: Path of the secret
+        :type secret_path: str
+        :rtype: dict
+        :return: secret stored in the vault as a dictionary
+        """
+        return self.vault_client.get_secret(secret_path=secret_path, secret_version=secret_version)
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: Path to read from
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        return self.vault_client.get_secret_metadata(secret_path=secret_path)
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: Path of the secret
+        :type secret_path: str
+        :param secret_version: Optional version of key to read - can only be used in case of version 2 of KV
+        :type secret_version: int
+        :rtype: dict
+        :return: key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+
+        """
+        return self.vault_client.get_secret_including_metadata(
+            secret_path=secret_path, secret_version=secret_version)
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: Path to read from
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.

Review comment:
       Same question here about how does this render.

##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,546 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook an Secret, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__(**kwargs)
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_metadata(
+                path=secret_path,
+                mount_point=self.mount_point)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+        :rtype: dict
+        :return: The key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_version(
+                path=secret_path, mount_point=self.mount_point,
+                version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s and version %s",
+                           secret_path, self.mount_point, secret_version)
+            return None
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 2 and method:
+            raise VaultError("The method parameter is only valid for version 1")
+        if self.kv_engine_version == 1 and cas:
+            raise VaultError("The cas parameter is only valid for version 2")
+        if self.kv_engine_version == 1:
+            response = self.client.secrets.kv.v1.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, method=method)
+        else:
+            response = self.client.secrets.kv.v2.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, cas=cas)
+        return response
+
+
+class VaultHook(BaseHook):
+    """
+    HashiCorp Vault wrapper to Interact with HashiCorp Vault KeyValue Secret engine.
+
+    HashiCorp HVac documentation:
+       * https://hvac.readthedocs.io/en/stable/
+
+    You connect to the host specified as host in the connection. The login/password from the connection
+    are used as credentials usually and you can specify different authentication parameters
+    via init params or via corresponding extras in the connection.
+
+    The extras in the connection are named the same as the parameters (`mount_point`,'kv_engine_version' ...).
+
+    Login/Password are used as credentials:
+
+        * approle: password -> secret_id
+        * ldap: login -> username,   password -> password
+        * userpass: login -> username, password -> password
+
+    :param vault_conn_id: The id of the connection to use
+    :type vault_conn_id: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Select the version of the engine to run (``1`` or ``2``). Defaults to
+          version defined in connection or ``2`` if not defined in connection.
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault.
+        (for ``token`` and ``github`` auth_type)
+    :type token: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type)
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type)
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``)
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type)
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type)
+    :type gcp_scopes: str
+
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        vault_conn_id: str,
+        auth_type: Optional[str] = None,
+        mount_point: Optional[str] = None,
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = None,
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+    ):
+        super().__init__()
+        self.connection = self.get_connection(vault_conn_id)
+
+        if not auth_type:
+            auth_type = self.connection.extra_dejson.get('auth_type') or "token"
+
+        if not mount_point:
+            mount_point = self.connection.extra_dejson.get('mount_point')
+        if not mount_point:
+            mount_point = 'secret'
+
+        if not kv_engine_version:
+            conn_version = self.connection.extra_dejson.get("kv_engine_version")
+            try:
+                kv_engine_version = int(conn_version) if conn_version else DEFAULT_KV_ENGINE_VERSION
+            except ValueError:
+                raise VaultError(f"The version is not an int: {conn_version}. ")
+
+        if auth_type in ["approle"]:
+            if not role_id:
+                role_id = self.connection.extra_dejson.get('role_id')
+
+        if auth_type in ['github', "token"]:
+            if not token:
+                token = self.connection.extra_dejson.get('token')
+        gcp_key_path, gcp_scopes = \
+            self._get_gcp_parameters_from_connection(gcp_key_path, gcp_scopes) \
+            if auth_type == 'gcp' else (None, None)
+        kubernetes_jwt_path, kubernetes_role = \
+            self._get_kubernetes_parameters_from_connection(kubernetes_jwt_path, kubernetes_role) \
+            if auth_type == 'kubernetes' else (None, None)
+
+        url = f"{self.connection.schema}://{self.connection.host}"
+        if self.connection.port:
+            url += f":{self.connection.port}"
+
+        self.vault_client = _VaultClient(
+            url=url,
+            auth_type=auth_type,
+            mount_point=mount_point,
+            kv_engine_version=kv_engine_version,
+            token=token,
+            username=self.connection.login,
+            password=self.connection.password,
+            secret_id=self.connection.password,
+            role_id=role_id,
+            kubernetes_role=kubernetes_role,
+            kubernetes_jwt_path=kubernetes_jwt_path,
+            gcp_key_path=gcp_key_path,
+            gcp_scopes=gcp_scopes,
+        )
+
+    def _get_kubernetes_parameters_from_connection(
+        self, kubernetes_jwt_path: Optional[str], kubernetes_role: Optional[str]) \
+            -> Tuple[str, Optional[str]]:
+        if not kubernetes_jwt_path:
+            kubernetes_jwt_path = self.connection.extra_dejson.get("kubernetes_jwt_path")
+            if not kubernetes_jwt_path:
+                kubernetes_jwt_path = DEFAULT_KUBERNETES_JWT_PATH
+        if not kubernetes_role:
+            kubernetes_role = self.connection.extra_dejson.get("kubernetes_role")
+        return kubernetes_jwt_path, kubernetes_role
+
+    def _get_gcp_parameters_from_connection(
+        self, gcp_key_path: Optional[str], gcp_scopes: Optional[str]) \
+            -> Tuple[Optional[str], Optional[str]]:
+        if not gcp_scopes:
+            gcp_scopes = self.connection.extra_dejson.get("gcp_scopes")
+        if not gcp_key_path:
+            gcp_key_path = self.connection.extra_dejson.get("gcp_key_path")
+        return gcp_key_path, gcp_scopes
+
+    def get_conn(self) -> hvac.Client:
+        """
+        Retrieves connection to Vault.
+
+        :rtype: hvac.Client
+        :return: connection used.
+        """
+        return self.vault_client.client
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the engine.
+
+        :param secret_path: Path of the secret
+        :type secret_path: str
+        :param secret_version: Optional version of key to read - can only be used in case of version 2 of KV
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: Path of the secret
+        :type secret_path: str
+        :rtype: dict
+        :return: secret stored in the vault as a dictionary
+        """
+        return self.vault_client.get_secret(secret_path=secret_path, secret_version=secret_version)
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: Path to read from
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        return self.vault_client.get_secret_metadata(secret_path=secret_path)
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: Path of the secret
+        :type secret_path: str
+        :param secret_version: Optional version of key to read - can only be used in case of version 2 of KV
+        :type secret_version: int
+        :rtype: dict
+        :return: key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.

Review comment:
       ```suggestion
           :return: key info. This is a Dict with "data" mapping keeping secret
               and "metadata" mapping keeping metadata of the secret.
   ```
   
   I think is the correct indentation?

##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,546 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [

Review comment:
       (I'm never sure when mypy can workout the type and when we need to specify it. I'd have thought it could have guessed this case... :shrug:)

##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,546 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook an Secret, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__(**kwargs)
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_metadata(
+                path=secret_path,
+                mount_point=self.mount_point)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+        :rtype: dict
+        :return: The key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_version(
+                path=secret_path, mount_point=self.mount_point,
+                version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s and version %s",
+                           secret_path, self.mount_point, secret_version)
+            return None
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 2 and method:
+            raise VaultError("The method parameter is only valid for version 1")
+        if self.kv_engine_version == 1 and cas:
+            raise VaultError("The cas parameter is only valid for version 2")
+        if self.kv_engine_version == 1:
+            response = self.client.secrets.kv.v1.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, method=method)
+        else:
+            response = self.client.secrets.kv.v2.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, cas=cas)
+        return response
+
+
+class VaultHook(BaseHook):
+    """
+    HashiCorp Vault wrapper to Interact with HashiCorp Vault KeyValue Secret engine.
+
+    HashiCorp HVac documentation:
+       * https://hvac.readthedocs.io/en/stable/
+
+    You connect to the host specified as host in the connection. The login/password from the connection
+    are used as credentials usually and you can specify different authentication parameters
+    via init params or via corresponding extras in the connection.
+
+    The extras in the connection are named the same as the parameters (`mount_point`,'kv_engine_version' ...).
+
+    Login/Password are used as credentials:
+
+        * approle: password -> secret_id
+        * ldap: login -> username,   password -> password
+        * userpass: login -> username, password -> password
+
+    :param vault_conn_id: The id of the connection to use
+    :type vault_conn_id: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Select the version of the engine to run (``1`` or ``2``). Defaults to
+          version defined in connection or ``2`` if not defined in connection.
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault.
+        (for ``token`` and ``github`` auth_type)
+    :type token: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type)
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type)
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``)
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type)
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type)
+    :type gcp_scopes: str
+
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        vault_conn_id: str,
+        auth_type: Optional[str] = None,
+        mount_point: Optional[str] = None,
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = None,
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+    ):
+        super().__init__()
+        self.connection = self.get_connection(vault_conn_id)
+
+        if not auth_type:
+            auth_type = self.connection.extra_dejson.get('auth_type') or "token"
+
+        if not mount_point:
+            mount_point = self.connection.extra_dejson.get('mount_point')
+        if not mount_point:
+            mount_point = 'secret'
+
+        if not kv_engine_version:
+            conn_version = self.connection.extra_dejson.get("kv_engine_version")
+            try:
+                kv_engine_version = int(conn_version) if conn_version else DEFAULT_KV_ENGINE_VERSION
+            except ValueError:
+                raise VaultError(f"The version is not an int: {conn_version}. ")
+
+        if auth_type in ["approle"]:
+            if not role_id:
+                role_id = self.connection.extra_dejson.get('role_id')
+
+        if auth_type in ['github', "token"]:
+            if not token:
+                token = self.connection.extra_dejson.get('token')
+        gcp_key_path, gcp_scopes = \
+            self._get_gcp_parameters_from_connection(gcp_key_path, gcp_scopes) \
+            if auth_type == 'gcp' else (None, None)
+        kubernetes_jwt_path, kubernetes_role = \
+            self._get_kubernetes_parameters_from_connection(kubernetes_jwt_path, kubernetes_role) \
+            if auth_type == 'kubernetes' else (None, None)
+
+        url = f"{self.connection.schema}://{self.connection.host}"
+        if self.connection.port:
+            url += f":{self.connection.port}"
+
+        self.vault_client = _VaultClient(
+            url=url,
+            auth_type=auth_type,
+            mount_point=mount_point,
+            kv_engine_version=kv_engine_version,
+            token=token,
+            username=self.connection.login,
+            password=self.connection.password,
+            secret_id=self.connection.password,
+            role_id=role_id,
+            kubernetes_role=kubernetes_role,
+            kubernetes_jwt_path=kubernetes_jwt_path,
+            gcp_key_path=gcp_key_path,
+            gcp_scopes=gcp_scopes,
+        )
+
+    def _get_kubernetes_parameters_from_connection(
+        self, kubernetes_jwt_path: Optional[str], kubernetes_role: Optional[str]) \
+            -> Tuple[str, Optional[str]]:
+        if not kubernetes_jwt_path:
+            kubernetes_jwt_path = self.connection.extra_dejson.get("kubernetes_jwt_path")
+            if not kubernetes_jwt_path:
+                kubernetes_jwt_path = DEFAULT_KUBERNETES_JWT_PATH
+        if not kubernetes_role:
+            kubernetes_role = self.connection.extra_dejson.get("kubernetes_role")
+        return kubernetes_jwt_path, kubernetes_role
+
+    def _get_gcp_parameters_from_connection(
+        self, gcp_key_path: Optional[str], gcp_scopes: Optional[str]) \
+            -> Tuple[Optional[str], Optional[str]]:

Review comment:
       ```suggestion
       def _get_gcp_parameters_from_connection(
           self,
           gcp_key_path: Optional[str],
           gcp_scopes: Optional[str],
       ) -> Tuple[Optional[str], Optional[str]]:
   ```
   
   I find this much easier to read -- I try to avoid using a `\` unless I just can't find a way around it. Thoughts?

##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,546 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook an Secret, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__(**kwargs)
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_metadata(
+                path=secret_path,
+                mount_point=self.mount_point)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+        :rtype: dict
+        :return: The key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_version(
+                path=secret_path, mount_point=self.mount_point,
+                version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s and version %s",
+                           secret_path, self.mount_point, secret_version)
+            return None
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 2 and method:
+            raise VaultError("The method parameter is only valid for version 1")
+        if self.kv_engine_version == 1 and cas:
+            raise VaultError("The cas parameter is only valid for version 2")
+        if self.kv_engine_version == 1:
+            response = self.client.secrets.kv.v1.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, method=method)
+        else:
+            response = self.client.secrets.kv.v2.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, cas=cas)
+        return response
+
+
+class VaultHook(BaseHook):
+    """
+    HashiCorp Vault wrapper to Interact with HashiCorp Vault KeyValue Secret engine.
+
+    HashiCorp HVac documentation:
+       * https://hvac.readthedocs.io/en/stable/
+
+    You connect to the host specified as host in the connection. The login/password from the connection
+    are used as credentials usually and you can specify different authentication parameters
+    via init params or via corresponding extras in the connection.
+
+    The extras in the connection are named the same as the parameters (`mount_point`,'kv_engine_version' ...).
+
+    Login/Password are used as credentials:
+
+        * approle: password -> secret_id
+        * ldap: login -> username,   password -> password
+        * userpass: login -> username, password -> password
+
+    :param vault_conn_id: The id of the connection to use
+    :type vault_conn_id: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Select the version of the engine to run (``1`` or ``2``). Defaults to
+          version defined in connection or ``2`` if not defined in connection.
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault.
+        (for ``token`` and ``github`` auth_type)
+    :type token: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type)
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type)
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``)
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type)
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type)
+    :type gcp_scopes: str
+
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        vault_conn_id: str,
+        auth_type: Optional[str] = None,
+        mount_point: Optional[str] = None,
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = None,
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+    ):
+        super().__init__()
+        self.connection = self.get_connection(vault_conn_id)
+
+        if not auth_type:
+            auth_type = self.connection.extra_dejson.get('auth_type') or "token"
+
+        if not mount_point:
+            mount_point = self.connection.extra_dejson.get('mount_point')
+        if not mount_point:
+            mount_point = 'secret'
+
+        if not kv_engine_version:
+            conn_version = self.connection.extra_dejson.get("kv_engine_version")
+            try:
+                kv_engine_version = int(conn_version) if conn_version else DEFAULT_KV_ENGINE_VERSION
+            except ValueError:
+                raise VaultError(f"The version is not an int: {conn_version}. ")
+
+        if auth_type in ["approle"]:
+            if not role_id:
+                role_id = self.connection.extra_dejson.get('role_id')
+
+        if auth_type in ['github', "token"]:
+            if not token:
+                token = self.connection.extra_dejson.get('token')
+        gcp_key_path, gcp_scopes = \
+            self._get_gcp_parameters_from_connection(gcp_key_path, gcp_scopes) \
+            if auth_type == 'gcp' else (None, None)
+        kubernetes_jwt_path, kubernetes_role = \
+            self._get_kubernetes_parameters_from_connection(kubernetes_jwt_path, kubernetes_role) \
+            if auth_type == 'kubernetes' else (None, None)
+
+        url = f"{self.connection.schema}://{self.connection.host}"
+        if self.connection.port:
+            url += f":{self.connection.port}"
+
+        self.vault_client = _VaultClient(
+            url=url,
+            auth_type=auth_type,
+            mount_point=mount_point,
+            kv_engine_version=kv_engine_version,
+            token=token,
+            username=self.connection.login,
+            password=self.connection.password,
+            secret_id=self.connection.password,
+            role_id=role_id,
+            kubernetes_role=kubernetes_role,
+            kubernetes_jwt_path=kubernetes_jwt_path,
+            gcp_key_path=gcp_key_path,
+            gcp_scopes=gcp_scopes,
+        )
+
+    def _get_kubernetes_parameters_from_connection(
+        self, kubernetes_jwt_path: Optional[str], kubernetes_role: Optional[str]) \
+            -> Tuple[str, Optional[str]]:
+        if not kubernetes_jwt_path:
+            kubernetes_jwt_path = self.connection.extra_dejson.get("kubernetes_jwt_path")
+            if not kubernetes_jwt_path:
+                kubernetes_jwt_path = DEFAULT_KUBERNETES_JWT_PATH
+        if not kubernetes_role:
+            kubernetes_role = self.connection.extra_dejson.get("kubernetes_role")
+        return kubernetes_jwt_path, kubernetes_role
+
+    def _get_gcp_parameters_from_connection(
+        self, gcp_key_path: Optional[str], gcp_scopes: Optional[str]) \
+            -> Tuple[Optional[str], Optional[str]]:
+        if not gcp_scopes:
+            gcp_scopes = self.connection.extra_dejson.get("gcp_scopes")
+        if not gcp_key_path:
+            gcp_key_path = self.connection.extra_dejson.get("gcp_key_path")
+        return gcp_key_path, gcp_scopes
+
+    def get_conn(self) -> hvac.Client:
+        """
+        Retrieves connection to Vault.
+
+        :rtype: hvac.Client
+        :return: connection used.
+        """
+        return self.vault_client.client
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the engine.
+
+        :param secret_path: Path of the secret
+        :type secret_path: str
+        :param secret_version: Optional version of key to read - can only be used in case of version 2 of KV
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: Path of the secret
+        :type secret_path: str
+        :rtype: dict
+        :return: secret stored in the vault as a dictionary
+        """
+        return self.vault_client.get_secret(secret_path=secret_path, secret_version=secret_version)
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: Path to read from
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.

Review comment:
       How does this line render in the docs?

##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,546 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook an Secret, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__(**kwargs)
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_metadata(
+                path=secret_path,
+                mount_point=self.mount_point)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+        :rtype: dict
+        :return: The key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_version(
+                path=secret_path, mount_point=self.mount_point,
+                version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s and version %s",
+                           secret_path, self.mount_point, secret_version)
+            return None
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 2 and method:
+            raise VaultError("The method parameter is only valid for version 1")
+        if self.kv_engine_version == 1 and cas:
+            raise VaultError("The cas parameter is only valid for version 2")
+        if self.kv_engine_version == 1:
+            response = self.client.secrets.kv.v1.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, method=method)
+        else:
+            response = self.client.secrets.kv.v2.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, cas=cas)
+        return response
+
+
+class VaultHook(BaseHook):
+    """
+    HashiCorp Vault wrapper to Interact with HashiCorp Vault KeyValue Secret engine.
+
+    HashiCorp HVac documentation:
+       * https://hvac.readthedocs.io/en/stable/
+
+    You connect to the host specified as host in the connection. The login/password from the connection
+    are used as credentials usually and you can specify different authentication parameters
+    via init params or via corresponding extras in the connection.
+
+    The extras in the connection are named the same as the parameters (`mount_point`,'kv_engine_version' ...).
+
+    Login/Password are used as credentials:
+
+        * approle: password -> secret_id
+        * ldap: login -> username,   password -> password
+        * userpass: login -> username, password -> password
+
+    :param vault_conn_id: The id of the connection to use
+    :type vault_conn_id: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Select the version of the engine to run (``1`` or ``2``). Defaults to
+          version defined in connection or ``2`` if not defined in connection.
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault.
+        (for ``token`` and ``github`` auth_type)
+    :type token: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type)
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type)
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``)
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type)
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type)
+    :type gcp_scopes: str
+
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        vault_conn_id: str,
+        auth_type: Optional[str] = None,
+        mount_point: Optional[str] = None,
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = None,
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+    ):
+        super().__init__()
+        self.connection = self.get_connection(vault_conn_id)
+
+        if not auth_type:
+            auth_type = self.connection.extra_dejson.get('auth_type') or "token"
+
+        if not mount_point:
+            mount_point = self.connection.extra_dejson.get('mount_point')
+        if not mount_point:
+            mount_point = 'secret'
+
+        if not kv_engine_version:
+            conn_version = self.connection.extra_dejson.get("kv_engine_version")
+            try:
+                kv_engine_version = int(conn_version) if conn_version else DEFAULT_KV_ENGINE_VERSION
+            except ValueError:
+                raise VaultError(f"The version is not an int: {conn_version}. ")
+
+        if auth_type in ["approle"]:
+            if not role_id:
+                role_id = self.connection.extra_dejson.get('role_id')
+
+        if auth_type in ['github', "token"]:
+            if not token:
+                token = self.connection.extra_dejson.get('token')
+        gcp_key_path, gcp_scopes = \
+            self._get_gcp_parameters_from_connection(gcp_key_path, gcp_scopes) \

Review comment:
       Why do we pass `gcp_key_path, gcp_scopes` in to this function (same for kube below) -- aren't they guaranteed to only ever by None as they are local variables that are assigned a value based on the return




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #9333: Add HashiCorp Vault Hook (split-out from Vault secret backend)

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#issuecomment-645378386


   I think all comments handled in the last two fixups. We have now mount_point taken from the path, explicit support for vaults/vault URL schemas (on top of http/https). _VaultClient is now moved to separate package and excluded from doc. Token is taken from password.
   
   I still left "overrideable" extras in the constructor (for non-sensitive fields) as I think it's really useful for anyone trying to use the hook.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #9333: Add HashiCorp Vault Hook (split-out from Vault secret backend)

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441464886



##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,548 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook and the SecretBackend, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        ('approle', 'github', 'gcp', 'kubernetes', 'ldap', 'token', 'userpass')
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__()
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the KV engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_metadata(
+                path=secret_path,
+                mount_point=self.mount_point)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+        :rtype: dict
+        :return: The key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_version(
+                path=secret_path, mount_point=self.mount_point,
+                version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s and version %s",
+                           secret_path, self.mount_point, secret_version)
+            return None
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 2 and method:
+            raise VaultError("The method parameter is only valid for version 1")
+        if self.kv_engine_version == 1 and cas:
+            raise VaultError("The cas parameter is only valid for version 2")
+        if self.kv_engine_version == 1:
+            response = self.client.secrets.kv.v1.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, method=method)
+        else:
+            response = self.client.secrets.kv.v2.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, cas=cas)
+        return response
+
+
+class VaultHook(BaseHook):
+    """
+    Hook to Interact with HashiCorp Vault KeyValue Secret engine.
+
+    HashiCorp hvac documentation:
+       * https://hvac.readthedocs.io/en/stable/
+
+    You connect to the host specified as host in the connection. The login/password from the connection
+    are used as credentials usually and you can specify different authentication parameters
+    via init params or via corresponding extras in the connection.
+
+    The extras in the connection are named the same as the parameters (`mount_point`,'kv_engine_version' ...).
+
+    Login/Password are used as credentials:
+
+        * approle: password -> secret_id
+        * ldap: login -> username,   password -> password
+        * userpass: login -> username, password -> password
+
+    :param vault_conn_id: The id of the connection to use
+    :type vault_conn_id: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this ``mount_point`` is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Select the version of the engine to run (``1`` or ``2``). Defaults to
+          version defined in connection or ``2`` if not defined in connection.
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault.
+        (for ``token`` and ``github`` auth_type)
+    :type token: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type)
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type)
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``)
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type)
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type)
+    :type gcp_scopes: str
+
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        vault_conn_id: str,
+        auth_type: Optional[str] = None,
+        mount_point: Optional[str] = None,
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = None,
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,

Review comment:
       I may have misunderstood the case, but these feel like passwords/sensitive info




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #9333: Add HashiCorp Vault Hook (split-out from Vault secret backend)

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441489366



##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,548 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook and the SecretBackend, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        ('approle', 'github', 'gcp', 'kubernetes', 'ldap', 'token', 'userpass')
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__()
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the KV engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_metadata(
+                path=secret_path,
+                mount_point=self.mount_point)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+        :rtype: dict
+        :return: The key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_version(
+                path=secret_path, mount_point=self.mount_point,
+                version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s and version %s",
+                           secret_path, self.mount_point, secret_version)
+            return None
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 2 and method:
+            raise VaultError("The method parameter is only valid for version 1")
+        if self.kv_engine_version == 1 and cas:
+            raise VaultError("The cas parameter is only valid for version 2")
+        if self.kv_engine_version == 1:
+            response = self.client.secrets.kv.v1.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, method=method)
+        else:
+            response = self.client.secrets.kv.v2.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, cas=cas)
+        return response
+
+
+class VaultHook(BaseHook):
+    """
+    Hook to Interact with HashiCorp Vault KeyValue Secret engine.
+
+    HashiCorp hvac documentation:
+       * https://hvac.readthedocs.io/en/stable/
+
+    You connect to the host specified as host in the connection. The login/password from the connection
+    are used as credentials usually and you can specify different authentication parameters
+    via init params or via corresponding extras in the connection.
+
+    The extras in the connection are named the same as the parameters (`mount_point`,'kv_engine_version' ...).
+
+    Login/Password are used as credentials:
+
+        * approle: password -> secret_id
+        * ldap: login -> username,   password -> password
+        * userpass: login -> username, password -> password
+
+    :param vault_conn_id: The id of the connection to use
+    :type vault_conn_id: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this ``mount_point`` is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Select the version of the engine to run (``1`` or ``2``). Defaults to
+          version defined in connection or ``2`` if not defined in connection.
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault.
+        (for ``token`` and ``github`` auth_type)
+    :type token: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type)
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type)
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``)
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type)
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type)
+    :type gcp_scopes: str
+
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        vault_conn_id: str,
+        auth_type: Optional[str] = None,
+        mount_point: Optional[str] = None,
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = None,
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+    ):
+        super().__init__()
+        self.connection = self.get_connection(vault_conn_id)
+
+        if not auth_type:
+            auth_type = self.connection.extra_dejson.get('auth_type') or "token"
+
+        if not mount_point:
+            mount_point = self.connection.extra_dejson.get('mount_point')
+        if not mount_point:
+            mount_point = 'secret'

Review comment:
       Good Idea. I think it make sense especially that in the calls to Vault this is exactly how the calls are made.

##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,548 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook and the SecretBackend, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        ('approle', 'github', 'gcp', 'kubernetes', 'ldap', 'token', 'userpass')
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__()
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the KV engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_metadata(
+                path=secret_path,
+                mount_point=self.mount_point)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+        :rtype: dict
+        :return: The key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_version(
+                path=secret_path, mount_point=self.mount_point,
+                version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s and version %s",
+                           secret_path, self.mount_point, secret_version)
+            return None
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 2 and method:
+            raise VaultError("The method parameter is only valid for version 1")
+        if self.kv_engine_version == 1 and cas:
+            raise VaultError("The cas parameter is only valid for version 2")
+        if self.kv_engine_version == 1:
+            response = self.client.secrets.kv.v1.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, method=method)
+        else:
+            response = self.client.secrets.kv.v2.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, cas=cas)
+        return response
+
+
+class VaultHook(BaseHook):
+    """
+    Hook to Interact with HashiCorp Vault KeyValue Secret engine.
+
+    HashiCorp hvac documentation:
+       * https://hvac.readthedocs.io/en/stable/
+
+    You connect to the host specified as host in the connection. The login/password from the connection
+    are used as credentials usually and you can specify different authentication parameters
+    via init params or via corresponding extras in the connection.
+
+    The extras in the connection are named the same as the parameters (`mount_point`,'kv_engine_version' ...).
+
+    Login/Password are used as credentials:
+
+        * approle: password -> secret_id
+        * ldap: login -> username,   password -> password
+        * userpass: login -> username, password -> password
+
+    :param vault_conn_id: The id of the connection to use
+    :type vault_conn_id: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this ``mount_point`` is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Select the version of the engine to run (``1`` or ``2``). Defaults to
+          version defined in connection or ``2`` if not defined in connection.
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault.
+        (for ``token`` and ``github`` auth_type)
+    :type token: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type)
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type)
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``)
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type)
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type)
+    :type gcp_scopes: str
+
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        vault_conn_id: str,
+        auth_type: Optional[str] = None,
+        mount_point: Optional[str] = None,
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = None,
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+    ):
+        super().__init__()
+        self.connection = self.get_connection(vault_conn_id)
+
+        if not auth_type:
+            auth_type = self.connection.extra_dejson.get('auth_type') or "token"
+
+        if not mount_point:
+            mount_point = self.connection.extra_dejson.get('mount_point')
+        if not mount_point:
+            mount_point = 'secret'

Review comment:
       Good Idea. I think it makes  sense especially that in the calls to Vault this is exactly how the calls are made.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #9333: Split-off vault hook from vault secret backend

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441433990



##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,546 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook an Secret, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__(**kwargs)
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the engine.

Review comment:
       ```suggestion
           Get secret value from the KV engine.
   ```

##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,546 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook an Secret, it should not be used directly in

Review comment:
       ```suggestion
       authentication code reuse between the Hook and Secret, it should not be used directly in
   ```

##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,546 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook an Secret, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.

Review comment:
       ```suggestion
       :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
            this ``mount_point`` is not used for authentication if authentication is done via a
            different engine.
   ```

##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,546 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook an Secret, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__(**kwargs)
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.

Review comment:
       Do we need this "tab space", if 'yes' then we should add it to L257 too

##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,546 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook an Secret, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__(**kwargs)

Review comment:
       ```suggestion
           super().__init__()
   ```

##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,546 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook an Secret, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__(**kwargs)

Review comment:
       We don't want to pass kwargs to LogginMixin

##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,546 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook an Secret, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__(**kwargs)
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_metadata(
+                path=secret_path,
+                mount_point=self.mount_point)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+        :rtype: dict
+        :return: The key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_version(
+                path=secret_path, mount_point=self.mount_point,
+                version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s and version %s",
+                           secret_path, self.mount_point, secret_version)
+            return None
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 2 and method:
+            raise VaultError("The method parameter is only valid for version 1")
+        if self.kv_engine_version == 1 and cas:
+            raise VaultError("The cas parameter is only valid for version 2")
+        if self.kv_engine_version == 1:
+            response = self.client.secrets.kv.v1.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, method=method)
+        else:
+            response = self.client.secrets.kv.v2.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, cas=cas)
+        return response
+
+
+class VaultHook(BaseHook):
+    """
+    HashiCorp Vault wrapper to Interact with HashiCorp Vault KeyValue Secret engine.
+
+    HashiCorp HVac documentation:

Review comment:
       ```suggestion
       HashiCorp HVAC documentation:
   ```
   
   or
   
   ```suggestion
       HashiCorp hvac documentation:
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #9333: Add HashiCorp Vault Hook (split-out from Vault secret backend)

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441457728



##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,548 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook and the SecretBackend, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        ('approle', 'github', 'gcp', 'kubernetes', 'ldap', 'token', 'userpass')
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__()
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the KV engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_metadata(
+                path=secret_path,
+                mount_point=self.mount_point)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+        :rtype: dict
+        :return: The key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_version(
+                path=secret_path, mount_point=self.mount_point,
+                version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s and version %s",
+                           secret_path, self.mount_point, secret_version)
+            return None
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 2 and method:
+            raise VaultError("The method parameter is only valid for version 1")
+        if self.kv_engine_version == 1 and cas:
+            raise VaultError("The cas parameter is only valid for version 2")
+        if self.kv_engine_version == 1:
+            response = self.client.secrets.kv.v1.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, method=method)
+        else:
+            response = self.client.secrets.kv.v2.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, cas=cas)
+        return response
+
+
+class VaultHook(BaseHook):
+    """
+    Hook to Interact with HashiCorp Vault KeyValue Secret engine.
+
+    HashiCorp hvac documentation:
+       * https://hvac.readthedocs.io/en/stable/
+
+    You connect to the host specified as host in the connection. The login/password from the connection
+    are used as credentials usually and you can specify different authentication parameters
+    via init params or via corresponding extras in the connection.
+
+    The extras in the connection are named the same as the parameters (`mount_point`,'kv_engine_version' ...).
+
+    Login/Password are used as credentials:
+
+        * approle: password -> secret_id
+        * ldap: login -> username,   password -> password
+        * userpass: login -> username, password -> password
+
+    :param vault_conn_id: The id of the connection to use
+    :type vault_conn_id: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this ``mount_point`` is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Select the version of the engine to run (``1`` or ``2``). Defaults to
+          version defined in connection or ``2`` if not defined in connection.
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault.
+        (for ``token`` and ``github`` auth_type)
+    :type token: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type)
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type)
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``)
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type)
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type)
+    :type gcp_scopes: str
+
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        vault_conn_id: str,
+        auth_type: Optional[str] = None,
+        mount_point: Optional[str] = None,
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = None,
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,

Review comment:
       I.e maybe we want extra mappings here:
   
   ```
    * token: password -> token
    * github: password -> token
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #9333: Split-off vault hook from vault secret backend

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441433136



##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,546 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [

Review comment:
       It does not  hurt to be explicit in this case I think :)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #9333: Add HashiCorp Vault Hook (split-out from Vault secret backend)

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441461533



##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,546 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook an Secret, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__(**kwargs)
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_metadata(
+                path=secret_path,
+                mount_point=self.mount_point)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+        :rtype: dict
+        :return: The key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_version(
+                path=secret_path, mount_point=self.mount_point,
+                version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s and version %s",
+                           secret_path, self.mount_point, secret_version)
+            return None
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 2 and method:
+            raise VaultError("The method parameter is only valid for version 1")
+        if self.kv_engine_version == 1 and cas:
+            raise VaultError("The cas parameter is only valid for version 2")
+        if self.kv_engine_version == 1:
+            response = self.client.secrets.kv.v1.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, method=method)
+        else:
+            response = self.client.secrets.kv.v2.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, cas=cas)
+        return response
+
+
+class VaultHook(BaseHook):
+    """
+    HashiCorp Vault wrapper to Interact with HashiCorp Vault KeyValue Secret engine.
+
+    HashiCorp HVac documentation:
+       * https://hvac.readthedocs.io/en/stable/
+
+    You connect to the host specified as host in the connection. The login/password from the connection
+    are used as credentials usually and you can specify different authentication parameters
+    via init params or via corresponding extras in the connection.
+
+    The extras in the connection are named the same as the parameters (`mount_point`,'kv_engine_version' ...).
+
+    Login/Password are used as credentials:
+
+        * approle: password -> secret_id
+        * ldap: login -> username,   password -> password
+        * userpass: login -> username, password -> password
+
+    :param vault_conn_id: The id of the connection to use
+    :type vault_conn_id: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Select the version of the engine to run (``1`` or ``2``). Defaults to
+          version defined in connection or ``2`` if not defined in connection.
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault.
+        (for ``token`` and ``github`` auth_type)
+    :type token: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type)
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type)
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``)
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type)
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type)
+    :type gcp_scopes: str
+
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        vault_conn_id: str,
+        auth_type: Optional[str] = None,
+        mount_point: Optional[str] = None,
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = None,
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+    ):
+        super().__init__()
+        self.connection = self.get_connection(vault_conn_id)
+
+        if not auth_type:
+            auth_type = self.connection.extra_dejson.get('auth_type') or "token"
+
+        if not mount_point:
+            mount_point = self.connection.extra_dejson.get('mount_point')
+        if not mount_point:
+            mount_point = 'secret'
+
+        if not kv_engine_version:
+            conn_version = self.connection.extra_dejson.get("kv_engine_version")
+            try:
+                kv_engine_version = int(conn_version) if conn_version else DEFAULT_KV_ENGINE_VERSION
+            except ValueError:
+                raise VaultError(f"The version is not an int: {conn_version}. ")
+
+        if auth_type in ["approle"]:
+            if not role_id:
+                role_id = self.connection.extra_dejson.get('role_id')
+
+        if auth_type in ['github', "token"]:
+            if not token:
+                token = self.connection.extra_dejson.get('token')
+        gcp_key_path, gcp_scopes = \
+            self._get_gcp_parameters_from_connection(gcp_key_path, gcp_scopes) \

Review comment:
       Do we need/want them to be in extra and constructor args?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #9333: Split-off vault hook from vault secret backend

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441437718



##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,546 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook an Secret, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__(**kwargs)
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_metadata(
+                path=secret_path,
+                mount_point=self.mount_point)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+        :rtype: dict
+        :return: The key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_version(
+                path=secret_path, mount_point=self.mount_point,
+                version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s and version %s",
+                           secret_path, self.mount_point, secret_version)
+            return None
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 2 and method:
+            raise VaultError("The method parameter is only valid for version 1")
+        if self.kv_engine_version == 1 and cas:
+            raise VaultError("The cas parameter is only valid for version 2")
+        if self.kv_engine_version == 1:
+            response = self.client.secrets.kv.v1.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, method=method)
+        else:
+            response = self.client.secrets.kv.v2.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, cas=cas)
+        return response
+
+
+class VaultHook(BaseHook):
+    """
+    HashiCorp Vault wrapper to Interact with HashiCorp Vault KeyValue Secret engine.
+
+    HashiCorp HVac documentation:
+       * https://hvac.readthedocs.io/en/stable/
+
+    You connect to the host specified as host in the connection. The login/password from the connection
+    are used as credentials usually and you can specify different authentication parameters
+    via init params or via corresponding extras in the connection.
+
+    The extras in the connection are named the same as the parameters (`mount_point`,'kv_engine_version' ...).
+
+    Login/Password are used as credentials:
+
+        * approle: password -> secret_id
+        * ldap: login -> username,   password -> password
+        * userpass: login -> username, password -> password
+
+    :param vault_conn_id: The id of the connection to use
+    :type vault_conn_id: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Select the version of the engine to run (``1`` or ``2``). Defaults to
+          version defined in connection or ``2`` if not defined in connection.
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault.
+        (for ``token`` and ``github`` auth_type)
+    :type token: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type)
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type)
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``)
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type)
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type)
+    :type gcp_scopes: str
+
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        vault_conn_id: str,
+        auth_type: Optional[str] = None,
+        mount_point: Optional[str] = None,
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = None,
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+    ):
+        super().__init__()
+        self.connection = self.get_connection(vault_conn_id)
+
+        if not auth_type:
+            auth_type = self.connection.extra_dejson.get('auth_type') or "token"
+
+        if not mount_point:
+            mount_point = self.connection.extra_dejson.get('mount_point')
+        if not mount_point:
+            mount_point = 'secret'
+
+        if not kv_engine_version:
+            conn_version = self.connection.extra_dejson.get("kv_engine_version")
+            try:
+                kv_engine_version = int(conn_version) if conn_version else DEFAULT_KV_ENGINE_VERSION
+            except ValueError:
+                raise VaultError(f"The version is not an int: {conn_version}. ")
+
+        if auth_type in ["approle"]:
+            if not role_id:
+                role_id = self.connection.extra_dejson.get('role_id')
+
+        if auth_type in ['github', "token"]:
+            if not token:
+                token = self.connection.extra_dejson.get('token')
+        gcp_key_path, gcp_scopes = \
+            self._get_gcp_parameters_from_connection(gcp_key_path, gcp_scopes) \
+            if auth_type == 'gcp' else (None, None)
+        kubernetes_jwt_path, kubernetes_role = \
+            self._get_kubernetes_parameters_from_connection(kubernetes_jwt_path, kubernetes_role) \
+            if auth_type == 'kubernetes' else (None, None)
+
+        url = f"{self.connection.schema}://{self.connection.host}"
+        if self.connection.port:
+            url += f":{self.connection.port}"
+
+        self.vault_client = _VaultClient(
+            url=url,
+            auth_type=auth_type,
+            mount_point=mount_point,
+            kv_engine_version=kv_engine_version,
+            token=token,
+            username=self.connection.login,
+            password=self.connection.password,
+            secret_id=self.connection.password,
+            role_id=role_id,
+            kubernetes_role=kubernetes_role,
+            kubernetes_jwt_path=kubernetes_jwt_path,
+            gcp_key_path=gcp_key_path,
+            gcp_scopes=gcp_scopes,
+        )
+
+    def _get_kubernetes_parameters_from_connection(
+        self, kubernetes_jwt_path: Optional[str], kubernetes_role: Optional[str]) \
+            -> Tuple[str, Optional[str]]:
+        if not kubernetes_jwt_path:
+            kubernetes_jwt_path = self.connection.extra_dejson.get("kubernetes_jwt_path")
+            if not kubernetes_jwt_path:
+                kubernetes_jwt_path = DEFAULT_KUBERNETES_JWT_PATH
+        if not kubernetes_role:
+            kubernetes_role = self.connection.extra_dejson.get("kubernetes_role")
+        return kubernetes_jwt_path, kubernetes_role
+
+    def _get_gcp_parameters_from_connection(
+        self, gcp_key_path: Optional[str], gcp_scopes: Optional[str]) \
+            -> Tuple[Optional[str], Optional[str]]:

Review comment:
       Works for me. I hope when we switch to Black or the like we will not have to worry about that :)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #9333: Split-off vault hook from vault secret backend

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441434142



##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,546 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook an Secret, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__(**kwargs)
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_metadata(
+                path=secret_path,
+                mount_point=self.mount_point)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+        :rtype: dict
+        :return: The key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_version(
+                path=secret_path, mount_point=self.mount_point,
+                version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s and version %s",
+                           secret_path, self.mount_point, secret_version)
+            return None
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 2 and method:
+            raise VaultError("The method parameter is only valid for version 1")
+        if self.kv_engine_version == 1 and cas:
+            raise VaultError("The cas parameter is only valid for version 2")
+        if self.kv_engine_version == 1:
+            response = self.client.secrets.kv.v1.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, method=method)
+        else:
+            response = self.client.secrets.kv.v2.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, cas=cas)
+        return response
+
+
+class VaultHook(BaseHook):
+    """
+    HashiCorp Vault wrapper to Interact with HashiCorp Vault KeyValue Secret engine.
+
+    HashiCorp HVac documentation:
+       * https://hvac.readthedocs.io/en/stable/
+
+    You connect to the host specified as host in the connection. The login/password from the connection
+    are used as credentials usually and you can specify different authentication parameters
+    via init params or via corresponding extras in the connection.
+
+    The extras in the connection are named the same as the parameters (`mount_point`,'kv_engine_version' ...).
+
+    Login/Password are used as credentials:
+
+        * approle: password -> secret_id
+        * ldap: login -> username,   password -> password
+        * userpass: login -> username, password -> password
+
+    :param vault_conn_id: The id of the connection to use
+    :type vault_conn_id: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Select the version of the engine to run (``1`` or ``2``). Defaults to
+          version defined in connection or ``2`` if not defined in connection.
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault.
+        (for ``token`` and ``github`` auth_type)
+    :type token: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type)
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type)
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``)
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type)
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type)
+    :type gcp_scopes: str
+
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        vault_conn_id: str,
+        auth_type: Optional[str] = None,
+        mount_point: Optional[str] = None,
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = None,
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+    ):
+        super().__init__()
+        self.connection = self.get_connection(vault_conn_id)
+
+        if not auth_type:
+            auth_type = self.connection.extra_dejson.get('auth_type') or "token"
+
+        if not mount_point:
+            mount_point = self.connection.extra_dejson.get('mount_point')
+        if not mount_point:
+            mount_point = 'secret'
+
+        if not kv_engine_version:
+            conn_version = self.connection.extra_dejson.get("kv_engine_version")
+            try:
+                kv_engine_version = int(conn_version) if conn_version else DEFAULT_KV_ENGINE_VERSION
+            except ValueError:
+                raise VaultError(f"The version is not an int: {conn_version}. ")
+
+        if auth_type in ["approle"]:
+            if not role_id:
+                role_id = self.connection.extra_dejson.get('role_id')
+
+        if auth_type in ['github', "token"]:
+            if not token:
+                token = self.connection.extra_dejson.get('token')
+        gcp_key_path, gcp_scopes = \
+            self._get_gcp_parameters_from_connection(gcp_key_path, gcp_scopes) \

Review comment:
       Nope. They are parameter of constructor. Can be overridden.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #9333: Add HashiCorp Vault Hook (split-out from Vault secret backend)

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#issuecomment-645298092


   All issues solved. Docs look good. I am reading #8974 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #9333: Add HashiCorp Vault Hook (split-out from Vault secret backend)

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441463828



##########
File path: airflow/providers/hashicorp/secrets/vault.py
##########
@@ -50,40 +46,42 @@ class VaultBackend(BaseSecretsBackend, LoggingMixin):
     would be accessible if you provide ``{"connections_path": "connections"}`` and request
     conn_id ``smtp_default``.
 
-    :param connections_path: Specifies the path of the secret to read to get Connections.
-        (default: 'connections')
+    :param connections_path: Specifies the path of the secret to read to get Connections
+        (default: 'connections').
     :type connections_path: str
-    :param variables_path: Specifies the path of the secret to read to get Variables.
-        (default: 'variables')
+    :param variables_path: Specifies the path of the secret to read to get Variables
+        (default: 'variables').
     :type variables_path: str
     :param url: Base URL for the Vault instance being addressed.
     :type url: str
-    :param auth_type: Authentication Type for Vault (one of 'token', 'ldap', 'userpass', 'approle',
-        'github', 'gcp', 'kubernetes'). Default is ``token``.
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in

Review comment:
       ```suggestion
       :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #9333: Add HashiCorp Vault Hook (split-out from Vault secret backend)

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441591351



##########
File path: airflow/models/connection.py
##########
@@ -173,7 +173,8 @@ class Connection(Base, LoggingMixin):
         ('tableau', 'Tableau'),
         ('kubernetes', 'Kubernetes cluster Connection'),
         ('spark', 'Spark'),
-        ('imap', 'IMAP')
+        ('imap', 'IMAP'),
+        ('vault', 'Hashicorp Vault')

Review comment:
       ```suggestion
           ('vault', 'Hashicorp Vault'),
   ```
   
   Please




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #9333: Add HashiCorp Vault Hook (split-out from Vault secret backend)

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441455328



##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,548 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook and the SecretBackend, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        ('approle', 'github', 'gcp', 'kubernetes', 'ldap', 'token', 'userpass')
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__()
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the KV engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_metadata(
+                path=secret_path,
+                mount_point=self.mount_point)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+        :rtype: dict
+        :return: The key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_version(
+                path=secret_path, mount_point=self.mount_point,
+                version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s and version %s",
+                           secret_path, self.mount_point, secret_version)
+            return None
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 2 and method:
+            raise VaultError("The method parameter is only valid for version 1")
+        if self.kv_engine_version == 1 and cas:
+            raise VaultError("The cas parameter is only valid for version 2")
+        if self.kv_engine_version == 1:
+            response = self.client.secrets.kv.v1.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, method=method)
+        else:
+            response = self.client.secrets.kv.v2.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, cas=cas)
+        return response
+
+
+class VaultHook(BaseHook):
+    """
+    Hook to Interact with HashiCorp Vault KeyValue Secret engine.
+
+    HashiCorp hvac documentation:
+       * https://hvac.readthedocs.io/en/stable/
+
+    You connect to the host specified as host in the connection. The login/password from the connection
+    are used as credentials usually and you can specify different authentication parameters
+    via init params or via corresponding extras in the connection.
+
+    The extras in the connection are named the same as the parameters (`mount_point`,'kv_engine_version' ...).
+
+    Login/Password are used as credentials:
+
+        * approle: password -> secret_id
+        * ldap: login -> username,   password -> password
+        * userpass: login -> username, password -> password
+
+    :param vault_conn_id: The id of the connection to use
+    :type vault_conn_id: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this ``mount_point`` is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Select the version of the engine to run (``1`` or ``2``). Defaults to
+          version defined in connection or ``2`` if not defined in connection.
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault.
+        (for ``token`` and ``github`` auth_type)
+    :type token: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type)
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type)
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``)
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type)
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type)
+    :type gcp_scopes: str
+
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        vault_conn_id: str,
+        auth_type: Optional[str] = None,
+        mount_point: Optional[str] = None,
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = None,
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,

Review comment:
       Oh sorry - I just noticed this:
   
   In the hook, as per our conversation in the other PR, shouldn't all of these only come from fields in the Connection extras? Token especially is a password-like thing isn't it, so we don't ever want users specifying it in source code.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #9333: Split-off vault hook from vault secret backend

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441434142



##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,546 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook an Secret, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__(**kwargs)
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_metadata(
+                path=secret_path,
+                mount_point=self.mount_point)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+        :rtype: dict
+        :return: The key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_version(
+                path=secret_path, mount_point=self.mount_point,
+                version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s and version %s",
+                           secret_path, self.mount_point, secret_version)
+            return None
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 2 and method:
+            raise VaultError("The method parameter is only valid for version 1")
+        if self.kv_engine_version == 1 and cas:
+            raise VaultError("The cas parameter is only valid for version 2")
+        if self.kv_engine_version == 1:
+            response = self.client.secrets.kv.v1.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, method=method)
+        else:
+            response = self.client.secrets.kv.v2.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, cas=cas)
+        return response
+
+
+class VaultHook(BaseHook):
+    """
+    HashiCorp Vault wrapper to Interact with HashiCorp Vault KeyValue Secret engine.
+
+    HashiCorp HVac documentation:
+       * https://hvac.readthedocs.io/en/stable/
+
+    You connect to the host specified as host in the connection. The login/password from the connection
+    are used as credentials usually and you can specify different authentication parameters
+    via init params or via corresponding extras in the connection.
+
+    The extras in the connection are named the same as the parameters (`mount_point`,'kv_engine_version' ...).
+
+    Login/Password are used as credentials:
+
+        * approle: password -> secret_id
+        * ldap: login -> username,   password -> password
+        * userpass: login -> username, password -> password
+
+    :param vault_conn_id: The id of the connection to use
+    :type vault_conn_id: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Select the version of the engine to run (``1`` or ``2``). Defaults to
+          version defined in connection or ``2`` if not defined in connection.
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault.
+        (for ``token`` and ``github`` auth_type)
+    :type token: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type)
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type)
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``)
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type)
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type)
+    :type gcp_scopes: str
+
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        vault_conn_id: str,
+        auth_type: Optional[str] = None,
+        mount_point: Optional[str] = None,
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = None,
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+    ):
+        super().__init__()
+        self.connection = self.get_connection(vault_conn_id)
+
+        if not auth_type:
+            auth_type = self.connection.extra_dejson.get('auth_type') or "token"
+
+        if not mount_point:
+            mount_point = self.connection.extra_dejson.get('mount_point')
+        if not mount_point:
+            mount_point = 'secret'
+
+        if not kv_engine_version:
+            conn_version = self.connection.extra_dejson.get("kv_engine_version")
+            try:
+                kv_engine_version = int(conn_version) if conn_version else DEFAULT_KV_ENGINE_VERSION
+            except ValueError:
+                raise VaultError(f"The version is not an int: {conn_version}. ")
+
+        if auth_type in ["approle"]:
+            if not role_id:
+                role_id = self.connection.extra_dejson.get('role_id')
+
+        if auth_type in ['github', "token"]:
+            if not token:
+                token = self.connection.extra_dejson.get('token')
+        gcp_key_path, gcp_scopes = \
+            self._get_gcp_parameters_from_connection(gcp_key_path, gcp_scopes) \

Review comment:
       Nope. They are parameters of constructor. Can be overridden.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk merged pull request #9333: Add HashiCorp Vault Hook (split-out from Vault secret backend)

Posted by GitBox <gi...@apache.org>.
potiuk merged pull request #9333:
URL: https://github.com/apache/airflow/pull/9333


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #9333: Add HashiCorp Vault Hook (split-out from Vault secret backend)

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441458472



##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,548 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes
+    """
+    Retrieves Authenticated client from Hashicorp Vault. This is purely internal class promoting
+    authentication code reuse between the Hook and the SecretBackend, it should not be used directly in
+    Airflow DAGs. Use VaultBackend for backend integration and Hook in case you want to communicate
+    with VaultHook using standard Airflow Connection definition.
+
+    :param url: Base URL for the Vault instance being addressed.
+    :type url: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        ('approle', 'github', 'gcp', 'kubernetes', 'ldap', 'token', 'userpass')
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this mount_point is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Selects the version of the engine to run (``1`` or ``2``, default: ``2``).
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault
+        (for ``token`` and ``github`` auth_type).
+    :type token: str
+    :param username: Username for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type username: str
+    :param password: Password for Authentication (for ``ldap`` and ``userpass`` auth_types).
+    :type password: str
+    :param secret_id: Secret ID for Authentication (for ``approle`` auth_type).
+    :type secret_id: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type).
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type).
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``).
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type).
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type).
+    :type gcp_scopes: str
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        url: Optional[str] = None,
+        auth_type: str = 'token',
+        mount_point: str = "secret",
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        username: Optional[str] = None,
+        password: Optional[str] = None,
+        secret_id: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = '/var/run/secrets/kubernetes.io/serviceaccount/token',
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__()
+        if kv_engine_version and kv_engine_version not in VALID_KV_VERSIONS:
+            raise VaultError(f"The version is not supported: {kv_engine_version}. "
+                             f"It should be one of {VALID_KV_VERSIONS}")
+        if auth_type not in VALID_AUTH_TYPES:
+            raise VaultError(f"The auth_type is not supported: {auth_type}. "
+                             f"It should be one of {VALID_AUTH_TYPES}")
+        if auth_type == "token" and not token:
+            raise VaultError("The 'token' authentication type requires 'token'")
+        if auth_type == "github" and not token:
+            raise VaultError("The 'github' authentication type requires 'token'")
+        if auth_type == "approle" and not role_id:
+            raise VaultError("The 'approle' authentication type requires 'role_id'")
+        if auth_type == "kubernetes":
+            if not kubernetes_role:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_role'")
+            if not kubernetes_jwt_path:
+                raise VaultError("The 'kubernetes' authentication type requires 'kubernetes_jwt_path'")
+
+        self.kv_engine_version = kv_engine_version if kv_engine_version else 2
+        self.url = url
+        self.auth_type = auth_type
+        self.kwargs = kwargs
+        self.token = token
+        self.mount_point = mount_point
+        self.username = username
+        self.password = password
+        self.secret_id = secret_id
+        self.role_id = role_id
+        self.kubernetes_role = kubernetes_role
+        self.kubernetes_jwt_path = kubernetes_jwt_path
+        self.gcp_key_path = gcp_key_path
+        self.gcp_scopes = gcp_scopes
+
+    @cached_property
+    def client(self) -> hvac.Client:
+        """
+        Return an authenticated Hashicorp Vault client.
+
+        :rtype: hvac.Client
+        :return: Vault Client
+
+        """
+        _client = hvac.Client(url=self.url, **self.kwargs)
+        if self.auth_type == "approle":
+            self._auth_approle(_client)
+        elif self.auth_type == "gcp":
+            self._auth_gcp(_client)
+        elif self.auth_type == "github":
+            self._auth_github(_client)
+        elif self.auth_type == "kubernetes":
+            self._auth_kubernetes(_client)
+        elif self.auth_type == "ldap":
+            self._auth_ldap(_client)
+        elif self.auth_type == "token":
+            _client.token = self.token
+        elif self.auth_type == "userpass":
+            self._auth_userpass(_client)
+        else:
+            raise VaultError(f"Authentication type '{self.auth_type}' not supported")
+
+        if _client.is_authenticated():
+            return _client
+        else:
+            raise VaultError("Vault Authentication Error!")
+
+    def _auth_userpass(self, _client: hvac.Client) -> None:
+        _client.auth_userpass(username=self.username, password=self.password)
+
+    def _auth_ldap(self, _client: hvac.Client) -> None:
+        _client.auth.ldap.login(
+            username=self.username, password=self.password)
+
+    def _auth_kubernetes(self, _client: hvac.Client) -> None:
+        if not self.kubernetes_jwt_path:
+            raise VaultError("The kubernetes_jwt_path should be set here. This should not happen.")
+        with open(self.kubernetes_jwt_path) as f:
+            jwt = f.read()
+            _client.auth_kubernetes(role=self.kubernetes_role, jwt=jwt)
+
+    def _auth_github(self, _client: hvac.Client) -> None:
+        _client.auth.github.login(token=self.token)
+
+    def _auth_gcp(self, _client: hvac.Client) -> None:
+        # noinspection PyProtectedMember
+        from airflow.providers.google.cloud.utils.credentials_provider import (
+            get_credentials_and_project_id,
+            _get_scopes
+        )
+        scopes = _get_scopes(self.gcp_scopes)
+        credentials, _ = get_credentials_and_project_id(key_path=self.gcp_key_path,
+                                                        scopes=scopes)
+        _client.auth.gcp.configure(credentials=credentials)
+
+    def _auth_approle(self, _client: hvac.Client) -> None:
+        _client.auth_approle(role_id=self.role_id, secret_id=self.secret_id)
+
+    def get_secret(self, secret_path: str, secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Get secret value from the KV engine.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+        and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :return: secret stored in the vault as a dictionary
+        """
+        try:
+            if self.kv_engine_version == 1:
+                if secret_version:
+                    raise VaultError("Secret version can only be used with version 2 of the KV engine")
+                response = self.client.secrets.kv.v1.read_secret(
+                    path=secret_path, mount_point=self.mount_point)
+            else:
+                response = self.client.secrets.kv.v2.read_secret_version(
+                    path=secret_path, mount_point=self.mount_point, version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+        return_data = response["data"] if self.kv_engine_version == 1 else response["data"]["data"]
+        return return_data
+
+    def get_secret_metadata(self, secret_path: str) -> Optional[dict]:
+        """
+        Reads secret metadata (including versions) from the engine. It is only valid for KV version 2.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :rtype: dict
+        :return: secret metadata. This is a Dict containing metadata for the secret.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_metadata(
+                path=secret_path,
+                mount_point=self.mount_point)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s", secret_path, self.mount_point)
+            return None
+
+    def get_secret_including_metadata(self,
+                                      secret_path: str,
+                                      secret_version: Optional[int] = None) -> Optional[dict]:
+        """
+        Reads secret including metadata. It is only valid for KV version 2.
+
+        See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret_version: Specifies the version of Secret to return. If not set, the latest
+            version is returned. (Can only be used in case of version 2 of KV).
+        :type secret_version: int
+        :rtype: dict
+        :return: The key info. This is a Dict with "data" mapping keeping secret
+                 and "metadata" mapping keeping metadata of the secret.
+        """
+        if self.kv_engine_version == 1:
+            raise VaultError("Metadata might only be used with version 2 of the KV engine.")
+        try:
+            return self.client.secrets.kv.v2.read_secret_version(
+                path=secret_path, mount_point=self.mount_point,
+                version=secret_version)
+        except InvalidPath:
+            self.log.debug("Secret not found %s with mount point %s and version %s",
+                           secret_path, self.mount_point, secret_version)
+            return None
+
+    def create_or_update_secret(self,
+                                secret_path: str,
+                                secret: dict,
+                                method: Optional[str] = None,
+                                cas: Optional[int] = None) -> Response:
+        """
+        Creates or updates secret.
+
+        :param secret_path: The path of the secret.
+        :type secret_path: str
+        :param secret: Secret to create or update for the path specified
+        :type secret: dict
+        :param method: Optional parameter to explicitly request a POST (create) or PUT (update) request to
+            the selected kv secret engine. If no argument is provided for this parameter, hvac attempts to
+            intelligently determine which method is appropriate. Only valid for KV engine version 1
+        :type method: str
+        :param cas: Set the "cas" value to use a Check-And-Set operation. If not set the write will be
+            allowed. If set to 0 a write will only be allowed if the key doesn't exist.
+            If the index is non-zero the write will only be allowed if the key's current version
+            matches the version specified in the cas parameter. Only valid for KV engine version 2.
+        :type cas: int
+        :rtype: requests.Response
+        :return: The response of the create_or_update_secret request.
+
+                 See https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v1.html
+                 and https://hvac.readthedocs.io/en/stable/usage/secrets_engines/kv_v2.html for details.
+
+        """
+        if self.kv_engine_version == 2 and method:
+            raise VaultError("The method parameter is only valid for version 1")
+        if self.kv_engine_version == 1 and cas:
+            raise VaultError("The cas parameter is only valid for version 2")
+        if self.kv_engine_version == 1:
+            response = self.client.secrets.kv.v1.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, method=method)
+        else:
+            response = self.client.secrets.kv.v2.create_or_update_secret(
+                secret_path=secret_path, secret=secret, mount_point=self.mount_point, cas=cas)
+        return response
+
+
+class VaultHook(BaseHook):
+    """
+    Hook to Interact with HashiCorp Vault KeyValue Secret engine.
+
+    HashiCorp hvac documentation:
+       * https://hvac.readthedocs.io/en/stable/
+
+    You connect to the host specified as host in the connection. The login/password from the connection
+    are used as credentials usually and you can specify different authentication parameters
+    via init params or via corresponding extras in the connection.
+
+    The extras in the connection are named the same as the parameters (`mount_point`,'kv_engine_version' ...).
+
+    Login/Password are used as credentials:
+
+        * approle: password -> secret_id
+        * ldap: login -> username,   password -> password
+        * userpass: login -> username, password -> password
+
+    :param vault_conn_id: The id of the connection to use
+    :type vault_conn_id: str
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.
+    :type auth_type: str
+    :param mount_point: The "path" the secret engine was mounted on. Default is "secret". Note that
+         this ``mount_point`` is not used for authentication if authentication is done via a
+         different engine.
+    :type mount_point: str
+    :param kv_engine_version: Select the version of the engine to run (``1`` or ``2``). Defaults to
+          version defined in connection or ``2`` if not defined in connection.
+    :type kv_engine_version: int
+    :param token: Authentication token to include in requests sent to Vault.
+        (for ``token`` and ``github`` auth_type)
+    :type token: str
+    :param role_id: Role ID for Authentication (for ``approle`` auth_type)
+    :type role_id: str
+    :param kubernetes_role: Role for Authentication (for ``kubernetes`` auth_type)
+    :type kubernetes_role: str
+    :param kubernetes_jwt_path: Path for kubernetes jwt token (for ``kubernetes`` auth_type, default:
+        ``/var/run/secrets/kubernetes.io/serviceaccount/token``)
+    :type kubernetes_jwt_path: str
+    :param gcp_key_path: Path to GCP Credential JSON file (for ``gcp`` auth_type)
+    :type gcp_key_path: str
+    :param gcp_scopes: Comma-separated string containing GCP scopes (for ``gcp`` auth_type)
+    :type gcp_scopes: str
+
+    """
+    def __init__(  # pylint: disable=too-many-arguments
+        self,
+        vault_conn_id: str,
+        auth_type: Optional[str] = None,
+        mount_point: Optional[str] = None,
+        kv_engine_version: Optional[int] = None,
+        token: Optional[str] = None,
+        role_id: Optional[str] = None,
+        kubernetes_role: Optional[str] = None,
+        kubernetes_jwt_path: Optional[str] = None,
+        gcp_key_path: Optional[str] = None,
+        gcp_scopes: Optional[str] = None,
+    ):
+        super().__init__()
+        self.connection = self.get_connection(vault_conn_id)
+
+        if not auth_type:
+            auth_type = self.connection.extra_dejson.get('auth_type') or "token"
+
+        if not mount_point:
+            mount_point = self.connection.extra_dejson.get('mount_point')
+        if not mount_point:
+            mount_point = 'secret'

Review comment:
       Does it make sense to take the mountpoint from the "path" part of the Connection? (I think that is confusingly named as "connectin.schema")
   
   That way the connection couldbe be defined as `vault://user:p@vault/my/secrets`. WDYT?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #9333: Add HashiCorp Vault Hook (split-out from Vault secret backend)

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441463329



##########
File path: airflow/providers/hashicorp/hooks/vault.py
##########
@@ -0,0 +1,548 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Hook for HashiCorp Vault"""
+from typing import List, Optional, Tuple
+
+import hvac
+from cached_property import cached_property
+from hvac.exceptions import InvalidPath, VaultError
+from requests import Response
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_KUBERNETES_JWT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
+DEFAULT_KV_ENGINE_VERSION = 2
+
+
+VALID_KV_VERSIONS: List[int] = [1, 2]
+VALID_AUTH_TYPES: List[str] = [
+    'approle',
+    'github',
+    'gcp',
+    'kubernetes',
+    'ldap',
+    'token',
+    'userpass'
+]
+
+
+class _VaultClient(LoggingMixin):  # pylint: disable=too-many-instance-attributes

Review comment:
       This class is showing up in the rendered docs -- I don't think it should be.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #9333: Add HashiCorp Vault Hook (split-out from Vault secret backend)

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441452600



##########
File path: airflow/providers/hashicorp/secrets/vault.py
##########
@@ -50,42 +46,44 @@ class VaultBackend(BaseSecretsBackend, LoggingMixin):
     would be accessible if you provide ``{"connections_path": "connections"}`` and request
     conn_id ``smtp_default``.
 
-    :param connections_path: Specifies the path of the secret to read to get Connections.
-        (default: 'connections')
+    :param connections_path: Specifies the path of the secret to read to get Connections
+        (default: 'connections').
     :type connections_path: str
-    :param variables_path: Specifies the path of the secret to read to get Variables.
-        (default: 'variables')
+    :param variables_path: Specifies the path of the secret to read to get Variables
+        (default: 'variables').
     :type variables_path: str
     :param url: Base URL for the Vault instance being addressed.
     :type url: str
-    :param auth_type: Authentication Type for Vault (one of 'token', 'ldap', 'userpass', 'approle',
-        'github', 'gcp', 'kubernetes'). Default is ``token``.
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.

Review comment:
       Yep. No enum here is better.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #9333: Split-off vault hook from vault secret backend

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #9333:
URL: https://github.com/apache/airflow/pull/9333#discussion_r441432408



##########
File path: airflow/providers/hashicorp/secrets/vault.py
##########
@@ -50,42 +46,44 @@ class VaultBackend(BaseSecretsBackend, LoggingMixin):
     would be accessible if you provide ``{"connections_path": "connections"}`` and request
     conn_id ``smtp_default``.
 
-    :param connections_path: Specifies the path of the secret to read to get Connections.
-        (default: 'connections')
+    :param connections_path: Specifies the path of the secret to read to get Connections
+        (default: 'connections').
     :type connections_path: str
-    :param variables_path: Specifies the path of the secret to read to get Variables.
-        (default: 'variables')
+    :param variables_path: Specifies the path of the secret to read to get Variables
+        (default: 'variables').
     :type variables_path: str
     :param url: Base URL for the Vault instance being addressed.
     :type url: str
-    :param auth_type: Authentication Type for Vault (one of 'token', 'ldap', 'userpass', 'approle',
-        'github', 'gcp', 'kubernetes'). Default is ``token``.
+    :param auth_type: Authentication Type for Vault. Default is ``token``. Available values are in
+        :py:const:`airflow.providers.hashicorp.hooks.vault.VALID_AUTH_TYPES`.

Review comment:
       I replace it with explicit list. It's better this way.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org