You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/07/12 20:07:29 UTC

[GitHub] [airflow] fredthomsen opened a new issue #16952: Secrets Backend Search Path Ordering/Priority

fredthomsen opened a new issue #16952:
URL: https://github.com/apache/airflow/issues/16952


   **Description**
   
   A way to set a custom secrets backend to be lower priority than the built-in `airflow.secrets.environment_variables.EnvironmentVariablesBackend` and `airflow.secrets.metastore.MetastoreBackend`.
   
   **Use case / motivation**
   When creating a our own secrets backend utilizing Secret Server, our team noticed you cannot configure the the custom backend to be a lower priority than the default secrets backends.  In certain cases, we have DAGs that write to different sets of external systems and being able to change one of those external systems easily via environment variable to test certain conditions is a very simple way to validate things, and we also have several variables that have no need of security and checking the env vars first eliminates that network call/load to a busy system.  
   
   Now as a workaround, I do realize we can have our own secrets backend check available env vars first, but this does seem a bit clunky given the current design.  
   
   The goal would be to be able to toggle a custom backend to be lower priority such that the metastore and env vars are checked first.
   
   **Are you willing to submit a PR?**
   
   Yes, definitely.
   
   **Related Issues**
   
   You could argue #16404 is slightly related.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #16952: Secrets Backend Search Path Ordering/Priority

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #16952:
URL: https://github.com/apache/airflow/issues/16952#issuecomment-882361187


   > Maybe we should ship a secret backend implementation that allows the user to pass multiple secret backends and search them in that order? Something like:
   
   I wanted to do very similar thing with Multi Executor allowing any combination of those (I even had working POC) but eventually we settled on only having CeleryKubernetesExecutor  to handle only this particular case, and I think it was a very good decision. We could focus on only making this case working for all the edge cases and even that proved to take a few releases.
   
   I know it is tempting to provide something like you said, but also it baloons the number of test cases and increases complexity and "supportability surface" immensely. I think when we get to the point where we have to add dictionary of arrays in configuration, it's already pretty bad. You also open up to all the questions when there are "support question" you will first have to understand all the potential paths there, whether None or empty are supported by each backend, and when someone adds their own custom backends to the mix with different behaviours, it might become even more complex. And the next step will be - "I want to have connections retrieved in sequence A/B and "airflow configuration" in sequence B/A". How would we configure that?
   
   This is not theorethical question -  we already had this discussion on how the whole secret backends should behave when then backend is unreachable (temporarily) https://github.com/apache/airflow/issues/14592 - where we dicussed (and agreed on) that when configuration variables are retrieved and secret backend is missing-in-action, Airflow should fail hard because it can start doing stuff that it is not intended to (due to fallback behaviour it could for example start using a wrong database). Those discussions will only increase in complexity if we allow multi-backend to be officially supported by Airflow.
   
   I think it's also more in-line with "philosophy" of Airlfow. I think we already give the users a chance to do this - they can write their own "aggregated" backend  - with less configuration woes, specifically targetting their cases and they can fine tune it's behaviour much better by writing a few lines of python code rather than describing how different python classes are connected via configuration. The whole premise of Airflow is that you are supposed to extend it by writing custom code rather than declaratively describe how you combine the different python classes. The whole DAG concept is all about it - you should be encouraged to write custom operators and build relations between them in Python code, rather than in configuration file.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on issue #16952: Secrets Backend Search Path Ordering/Priority

Posted by GitBox <gi...@apache.org>.
kaxil commented on issue #16952:
URL: https://github.com/apache/airflow/issues/16952#issuecomment-880281931


   I think the priority is correct and will cause confusion if changed later on. And with customer Secrets Backend, you can mix and match however you like. We intentionally did this (same fox XCom Backend) so that companies can create one for their own needs as ONE SIZE DOES NOT FEEL ALL.
   
   That being said, something I had planned earlier, was to allow DAG Authors to pick a single backend to choose the variable or connections from  ( **Not getting configurations from Secrets Backend though** ).
   
   Example: `Variable.get(key="example_key", secrets_backend="airflow.secrets.environment_variables.EnvironmentVariablesBackend")` will only check Environment Variables to get Airflow Variables.
   
   Only changes required:
   
   ```diff
   diff --git a/airflow/models/variable.py b/airflow/models/variable.py
   index 7d4726966..05b266bd3 100644
   --- a/airflow/models/variable.py
   +++ b/airflow/models/variable.py
   @@ -124,6 +124,7 @@ class Variable(Base, LoggingMixin):
            key: str,
            default_var: Any = __NO_DEFAULT_SENTINEL,
            deserialize_json: bool = False,
   +        secrets_backend: Optional[str] = None,
        ) -> Any:
            """
            Gets a value for an Airflow Variable Key
   @@ -132,7 +133,7 @@ class Variable(Base, LoggingMixin):
            :param default_var: Default value of the Variable if the Variable doesn't exists
            :param deserialize_json: Deserialize the value to a Python dict
            """
   -        var_val = Variable.get_variable_from_secrets(key=key)
   +        var_val = Variable.get_variable_from_secrets(key, secrets_backend)
            if var_val is None:
                if default_var is not cls.__NO_DEFAULT_SENTINEL:
                    return default_var
   @@ -193,14 +194,35 @@ class Variable(Base, LoggingMixin):
                self._val = fernet.rotate(self._val.encode('utf-8')).decode()
   
        @staticmethod
   -    def get_variable_from_secrets(key: str) -> Optional[str]:
   +    def get_variable_from_secrets(
   +        key: str,
   +        secrets_backend: Optional[str] = None,
   +    ) -> Optional[str]:
            """
            Get Airflow Variable by iterating over all Secret Backends.
   
            :param key: Variable Key
            :return: Variable Value
            """
   -        for secrets_backend in ensure_secrets_loaded():
   +        secrets_backends = ensure_secrets_loaded()
   +        secrets_backends_classes = {
   +            f"{backend.__class__.__module__}.{backend.__class__.__name__}": backend
   +            for backend in secrets_backends
   +        }
   +
   +        if secrets_backend not in secrets_backends_classes:
   +            raise KeyError(
   +                f"Invalid secrets backend - '{secrets_backend}'. "
   +                f"Should be one of {', '.join(secrets_backends_classes.keys())}"
   +            )
   +
   +        if secrets_backend:
   +            var_val = secrets_backends_classes[secrets_backends].get_variable(key=key)
   +            if var_val is not None:
   +                return var_val
   +            return None
   +
   ```
   
   cc @fhoda 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil edited a comment on issue #16952: Secrets Backend Search Path Ordering/Priority

Posted by GitBox <gi...@apache.org>.
kaxil edited a comment on issue #16952:
URL: https://github.com/apache/airflow/issues/16952#issuecomment-880281931


   I think the priority is correct and will cause confusion if changed later on. And with customer Secrets Backend, you can mix and match however you like. We intentionally did this (same fox XCom Backend) so that companies can create one for their own needs as ONE SIZE DOES NOT FEEL ALL.
   
   That being said, something I had planned earlier, was to allow DAG Authors to pick a single backend to choose the variable or connections from  ( **Not getting configurations from Secrets Backend though** ).
   
   For Example the following will only check Environment Variables to get Airflow Variables.: 
   
   ```python
   Variable.get(
       key="example_key",
       secrets_backend="airflow.secrets.environment_variables.EnvironmentVariablesBackend"
   )
   
   ```
   
   Only changes required:
   
   ```diff
   diff --git a/airflow/models/variable.py b/airflow/models/variable.py
   index 7d4726966..05b266bd3 100644
   --- a/airflow/models/variable.py
   +++ b/airflow/models/variable.py
   @@ -124,6 +124,7 @@ class Variable(Base, LoggingMixin):
            key: str,
            default_var: Any = __NO_DEFAULT_SENTINEL,
            deserialize_json: bool = False,
   +        secrets_backend: Optional[str] = None,
        ) -> Any:
            """
            Gets a value for an Airflow Variable Key
   @@ -132,7 +133,7 @@ class Variable(Base, LoggingMixin):
            :param default_var: Default value of the Variable if the Variable doesn't exists
            :param deserialize_json: Deserialize the value to a Python dict
            """
   -        var_val = Variable.get_variable_from_secrets(key=key)
   +        var_val = Variable.get_variable_from_secrets(key, secrets_backend)
            if var_val is None:
                if default_var is not cls.__NO_DEFAULT_SENTINEL:
                    return default_var
   @@ -193,14 +194,35 @@ class Variable(Base, LoggingMixin):
                self._val = fernet.rotate(self._val.encode('utf-8')).decode()
   
        @staticmethod
   -    def get_variable_from_secrets(key: str) -> Optional[str]:
   +    def get_variable_from_secrets(
   +        key: str,
   +        secrets_backend: Optional[str] = None,
   +    ) -> Optional[str]:
            """
            Get Airflow Variable by iterating over all Secret Backends.
   
            :param key: Variable Key
            :return: Variable Value
            """
   -        for secrets_backend in ensure_secrets_loaded():
   +        secrets_backends = ensure_secrets_loaded()
   +        secrets_backends_classes = {
   +            f"{backend.__class__.__module__}.{backend.__class__.__name__}": backend
   +            for backend in secrets_backends
   +        }
   +
   +        if secrets_backend not in secrets_backends_classes:
   +            raise KeyError(
   +                f"Invalid secrets backend - '{secrets_backend}'. "
   +                f"Should be one of {', '.join(secrets_backends_classes.keys())}"
   +            )
   +
   +        if secrets_backend:
   +            var_val = secrets_backends_classes[secrets_backends].get_variable(key=key)
   +            if var_val is not None:
   +                return var_val
   +            return None
   +
   ```
   
   cc @fhoda 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #16952: Secrets Backend Search Path Ordering/Priority

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #16952:
URL: https://github.com/apache/airflow/issues/16952#issuecomment-882074965


   I am with @kaxil  on this one. 
   
   I like the idea of being able to "pin" secret backend specifically as then it pushes the behaviour change and resulting  complexity down to DAG developers - not even to "ops" people of the whole Airflow (so that they deal with consequences but are also in full control of it). 
   
   I think more complexity in configuration and ability to swap things around might create far too much confusion if done at the "configuration" level for the whole Airfloow, even more so when in the future we would like to add multi-tenancy features (that feature sounds like something that poeple would ask their ops people to configure differently for different teams, and resulting troubleshooting might quickly spin out of control if we allow more flexibility here.
   
   I really prefer the situation we have now where we have only one opinionated "sequence" that is supported. At the same time you can write the "Composed" backend easily with your own logic, and - even in multi-tenant cases - each team could have their own custom "Composed" backend if needed likely in multi-tenant scenario rather easily (we could select custom backend per-tenant in configuration).  That sounds much more future-proof IMHO regarding configuration and maintenance complexity.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr edited a comment on issue #16952: Secrets Backend Search Path Ordering/Priority

Posted by GitBox <gi...@apache.org>.
uranusjr edited a comment on issue #16952:
URL: https://github.com/apache/airflow/issues/16952#issuecomment-882207637


   Maybe we should ship a secret backend implementation that allows the user to pass multiple secret backends and search them in that order? Something like:
   
   ```python
   class AggregatedBackend(BaseSecretsBackend):
       def __init__(self, backends: List[str]) -> None:
           super().__init__()
           self._backends = [get_function_from_name(name) for name in backends]
   
       def get_variable(self, key: str) -> Optional[str]:
           for backend in self._backends:
               if (value := backend.get_variable(key)) is not None:
                   return value
   ```
   
   This way the user can do something like
   
   ```ini
   [secrets]
   backend = airflow.secrets.aggregated.AggregatedBackend
   backend_kwargs = {"backends": ["airflow.secrets.environment_variables.EnvironmentVariablesBackend", "my_backend.MyBackend", "airflow.secrets.metastore.MetastoreBackend"]}
   ```
   
   We could even introduce some shorthands like `airflow.EnvironmentVariablesBackend`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #16952: Secrets Backend Search Path Ordering/Priority

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #16952:
URL: https://github.com/apache/airflow/issues/16952#issuecomment-882361187


   > Maybe we should ship a secret backend implementation that allows the user to pass multiple secret backends and search them in that order? Something like:
   
   I wanted to do very similar thing with Multi Executor allowing any combination of those (I even had working POC) but eventually we settled on only having CeleryKubernetesExecutor  to handle only this particular case, and I think it was a very good decision. We could focus on only making this case working for all the edge cases and even that proved to take a few releases.
   
   I know it is tempting to provide something like you said, but also it baloons the number of test cases and increases complexity and "supportability surface" immensely. I think when we get to the point where we have to add dictionary of arrays in configuration, it's already pretty bad. You also open up to all the questions when there are "support question" you will first have to understand all the potential paths there, whether None or empty are supported by each backend, and when someone adds their own custom backends to the mix with different behaviours, it might become even more complex. And the next step will be - "I want to have connections retrieved in sequence A/B and "airflow configuration" in sequence B/A". How would we configure that?
   
   This is not theorethical question -  we already had this discussion on how the whole secret backends should behave when then backend is unreachable (temporarily) https://github.com/apache/airflow/issues/14592 - where we dicussed (and agreed on) that when configuration variables are retrieved and secret backend is missing-in-action, Airflow should fail hard because it can start doing stuff that it is not intended to (due to fallback behaviour it could for example start using a wrong database). Those discussions will only increase in complexity if we allow multi-backend to be officially supported by Airflow.
   
   I think it's also more in-line with "philosophy" of Airlfow. I think we already give the users a chance to do this - they can write their own "aggregated" backend  - with less configuration woes, specifically targetting their cases and they can fine tune it's behaviour much better by writing a few lines of python code rather than describing how different python classes are connected via configuration. The whole premise of Airflow is that you are supposed to extend it by writing custom code rather than declaratively describe how you combine the different python classes. The whole DAG concept is all about it - you should be encouraged to write custom operators and build relations between them in Python code, rather than in configuration file.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] fredthomsen commented on issue #16952: Secrets Backend Search Path Ordering/Priority

Posted by GitBox <gi...@apache.org>.
fredthomsen commented on issue #16952:
URL: https://github.com/apache/airflow/issues/16952#issuecomment-886768966


   Thanks for the responses. Based on the feedback given, I'll go ahead and close this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] boring-cyborg[bot] commented on issue #16952: Secrets Backend Search Path Ordering/Priority

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #16952:
URL: https://github.com/apache/airflow/issues/16952#issuecomment-878507237


   Thanks for opening your first issue here! Be sure to follow the issue template!
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] fredthomsen closed issue #16952: Secrets Backend Search Path Ordering/Priority

Posted by GitBox <gi...@apache.org>.
fredthomsen closed issue #16952:
URL: https://github.com/apache/airflow/issues/16952


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on issue #16952: Secrets Backend Search Path Ordering/Priority

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on issue #16952:
URL: https://github.com/apache/airflow/issues/16952#issuecomment-882361187


   > Maybe we should ship a secret backend implementation that allows the user to pass multiple secret backends and search them in that order? Something like:
   
   I wanted to do very similar thing with Multi Executor allowing any combination of those (I even had working POC) but eventually we settled on only having CeleryKubernetesExecutor  to handle only this particular case, and I think it was a very good decision. We could focus on only making this case working for all the edge cases and even that proved to take a few releases.
   
   I know it is tempting to provide something like you said, but also it baloons the number of test cases and increases complexity and "supportability surface" immensely. I think when we get to the point where we have to add dictionary of arrays in configuration, it's already pretty bad. You also open up to all the questions when there are "support question" you will first have to understand all the potential paths there, whether None or empty are supported by each backend, and when someone adds their own custom backends to the mix with different behaviours, it might become even more complex. And the next step will be - "I want to have connections retrieved in sequence A/B and "airflow configuration" in sequence B/A". How would we configure that?
   
   This is not theorethical question -  we already had this discussion on how the whole secret backends should behave when then backend is unreachable (temporarily) https://github.com/apache/airflow/issues/14592 - where we dicussed (and agreed on) that when configuration variables are retrieved and secret backend is missing-in-action, Airflow should fail hard because it can start doing stuff that it is not intended to (due to fallback behaviour it could for example start using a wrong database). So we already 
   complicate the API of secret backend. Those discussions will only increase in complexity if we allow multi-backend to be officially supported by Airflow.
   
   I think it's also more in-line with "philosophy" of Airlfow. I think we already give the users a chance to do this - they can write their own "aggregated" backend  - with less configuration woes, specifically targetting their cases and they can fine tune it's behaviour much better by writing a few lines of python code rather than describing how different python classes are connected via configuration. The whole premise of Airflow is that you are supposed to extend it by writing custom code rather than declaratively describe how you combine the different python classes. The whole DAG concept is all about it - you should be encouraged to write custom operators and build relations between them in Python code, rather than in configuration file.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on issue #16952: Secrets Backend Search Path Ordering/Priority

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #16952:
URL: https://github.com/apache/airflow/issues/16952#issuecomment-882207637


   Maybe we should ship a secret backend implementation that allows the user to pass multiple secret backends and search them in that order? Something like:
   
   ```python
   class AggregatedBackend(BaseSecretsBackend):
       def __init__(self, backends: List[str]) -> None:
           super().__init__()
           self._backends = [get_function_from_name(name) for name in backends]
   
       def get_variable(self, key: str) -> Optional[str]:
           for backend in self._backends:
               if (value := backend.get_variable(key)) is not None:
                   return value
   ```
   
   This way the user can do something like
   
   ```ini
   [secrets]
   backend = airflow.secrets.aggregated.AggregatedBackend
   backend_kwargs = {"backends": ["airflow.secrets.environment_variables.EnvironmentVariablesBackend", "my_backend.MyBackend", "airflow.secrets.metastore.MetastoreBackend"]}
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] magler commented on issue #16952: Secrets Backend Search Path Ordering/Priority

Posted by GitBox <gi...@apache.org>.
magler commented on issue #16952:
URL: https://github.com/apache/airflow/issues/16952#issuecomment-880259029


   > 
   > 
   > @kaxil WDYT? II like this idea, but I wonder if there was any assumption on the basis of which you chose the current order.
   
   The Secrets backend docs mention a search order when an alternative secrets backend is used:
   https://airflow.apache.org/docs/apache-airflow/stable/security/secrets/secrets-backend/index.html#search-path
   >When looking up a connection/variable, by default Airflow will search environment variables first and metastore database second.
   >
   >If you enable an alternative secrets backend, it will be searched first, followed by environment variables, then metastore. This search ordering is not configurable.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on issue #16952: Secrets Backend Search Path Ordering/Priority

Posted by GitBox <gi...@apache.org>.
kaxil commented on issue #16952:
URL: https://github.com/apache/airflow/issues/16952#issuecomment-881703940


   > @kaxil I am not suggesting changing the default search order as it exists, just to be able to give the user the ability to change it via some configuration should they choose. I am not following how this is analogous to the xcom backend? Is there a search path there as well?
   
   The analogy is because you can create a single custom XCom backend that can based oj size of the object decide where to push the object like DB, GCS/s3 etc.
   
   Similarly, for Secrets Backend (or for that matter any config where we wallow custom classes) you can put a custom logic to handle however you want. 
   
   In the case of your own Custom Secrets Backend, you could check for Environment Variable first, and then calls a Hashicorp Vault or Google Secret Manager.
   
   Hope it is clearer now.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] fredthomsen commented on issue #16952: Secrets Backend Search Path Ordering/Priority

Posted by GitBox <gi...@apache.org>.
fredthomsen commented on issue #16952:
URL: https://github.com/apache/airflow/issues/16952#issuecomment-881555647


   @kaxil I am not suggesting changing the default search order as it exists, just to be able to give the user the ability to change it via some configuration should they choose.   I am not following how this is analogous to the xcom backend?  Is there a search path there as well? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on issue #16952: Secrets Backend Search Path Ordering/Priority

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on issue #16952:
URL: https://github.com/apache/airflow/issues/16952#issuecomment-882361187


   > Maybe we should ship a secret backend implementation that allows the user to pass multiple secret backends and search them in that order? Something like:
   
   I wanted to do very similar thing with Multi Executor allowing any combination of those (I even had working POC) but eventually we settled on only having CeleryKubernetesExecutor  to handle only this particular case, and I think it was a very good decision. We could focus on only making this case working for all the edge cases and even that proved to take a few releases.
   
   I know it is tempting to provide something like you said, but also it baloons the number of test cases and increases complexity and "supportability surface" immensely. I think when we get to the point where we have to add dictionary of arrays in configuration, it's already pretty bad. You also open up to all the questions when there are "support question" you will first have to understand all the potential paths there, whether None or empty are supported by each backend, and when someone adds their own custom backends to the mix with different behaviours, it might become even more complex. And the next step will be - "I want to have connections retrieved in sequence A/B and "airflow configuration" in sequence B/A". How would we configure that?
   
   This is not theorethical question -  we already had this discussion on how the whole secret backends should behave when then backend is unreachable (temporarily) https://github.com/apache/airflow/issues/14592 - where we dicussed (and agreed on) that when configuration variables are retrieved and secret backend is missing-in-action, Airflow should fail hard because it can start doing stuff that it is not intended to (due to fallback behaviour it could for example start using a wrong database). So we already 
   complicate the API of secret backend. Those discussions will only increase in complexity if we allow multi-backend to be officially supported by Airflow.
   
   I think it's also more in-line with "philosophy" of Airlfow. I think we already give the users a chance to do this - they can write their own "aggregated" backend  - with less configuration woes, specifically targetting their cases and they can fine tune it's behaviour much better by writing a few lines of python code rather than describing how different python classes are connected via configuration. The whole premise of Airflow is that you are supposed to extend it by writing custom code rather than declaratively describe how you combine the different python classes. The whole DAG concept is all about it - you should be encouraged to write custom operators and build relations between them in Python code, rather than in configuration file.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on issue #16952: Secrets Backend Search Path Ordering/Priority

Posted by GitBox <gi...@apache.org>.
mik-laj commented on issue #16952:
URL: https://github.com/apache/airflow/issues/16952#issuecomment-880246050


   @kaxil WDYT? II like this idea, but I wonder if there was any assumption on the basis of which you chose the current order.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org