Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/07/19 08:44:03 UTC

[GitHub] [airflow] potiuk edited a comment on issue #16952: Secrets Backend Search Path Ordering/Priority

potiuk edited a comment on issue #16952:
URL: https://github.com/apache/airflow/issues/16952#issuecomment-882361187

> Maybe we should ship a secret backend implementation that allows the user to pass multiple secret backends and search them in that order? Something like:

I wanted to do a very similar thing with a Multi Executor allowing any combination of executors (I even had a working POC), but eventually we settled on having only CeleryKubernetesExecutor to handle this one particular case, and I think it was a very good decision. We could focus on making just this case work for all the edge cases, and even that proved to take a few releases.

I know it is tempting to provide something like you said, but it also balloons the number of test cases and increases complexity and the "supportability surface" immensely. I think when we get to the point where we have to add a dictionary of arrays in configuration, it's already pretty bad. You also open yourself up to a lot of questions: when there is a support question, you will first have to understand all the potential paths there and whether None or empty values are supported by each backend, and when someone adds their own custom backends to the mix with different behaviours, it might become even more complex. And the next step will be - "I want to have connections retrieved in sequence A/B and "airflow configuration" in sequence B/A". How would we configure that?

This is not a theoretical question - we already had this discussion on how secret backends as a whole should behave when the backend is (temporarily) unreachable https://github.com/apache/airflow/issues/14592 - where we discussed (and agreed) that when configuration variables are retrieved and the secret backend is missing-in-action, Airflow should fail hard, because otherwise it can start doing stuff that it is not intended to (due to fallback behaviour it could, for example, start using a wrong database). So we already complicate the API of the secret backend. Those discussions will only increase in complexity if we allow multi-backend to be officially supported by Airflow.

I think not shipping this is also more in line with the "philosophy" of Airflow. I think we already give the users a chance to do this - they can write their own "aggregated" backend - with fewer configuration woes, specifically targeting their cases, and they can fine-tune its behaviour much better by writing a few lines of Python code rather than describing how different Python classes are connected via configuration. The whole premise of Airflow is that you are supposed to extend it by writing custom code rather than declaratively describing how you combine the different Python classes. The whole DAG concept is all about it - you should be encouraged to write custom operators and build relations between them in Python code, rather than in a configuration file.
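
To make the "few lines of Python code" point concrete, here is a rough sketch of what such a user-written "aggregated" backend could look like. It is deliberately standalone: `DictBackend`, `UnreachableBackend` and `AggregatedBackend` are hypothetical names made up for this illustration, and only the `get_variable(key)` method name is borrowed from Airflow's `BaseSecretsBackend`. A real implementation would presumably subclass `BaseSecretsBackend` and wrap real backends instead of dictionaries.

```python
from typing import Dict, List, Optional


class DictBackend:
    """Hypothetical stand-in for a real secrets backend, backed by a dict."""

    def __init__(self, variables: Dict[str, str]) -> None:
        self._variables = variables

    def get_variable(self, key: str) -> Optional[str]:
        # None means "this backend does not know the key".
        return self._variables.get(key)


class UnreachableBackend:
    """Hypothetical backend that simulates a temporary outage."""

    def get_variable(self, key: str) -> Optional[str]:
        raise ConnectionError("secrets backend is unreachable")


class AggregatedBackend:
    """Searches the wrapped backends in the order given in code."""

    def __init__(self, backends: List) -> None:
        self._backends = backends

    def get_variable(self, key: str) -> Optional[str]:
        for backend in self._backends:
            # Exceptions are not caught on purpose: if a backend is
            # unreachable we fail hard rather than silently falling
            # back to the next backend in the list.
            value = backend.get_variable(key)
            if value is not None:
                return value
        return None


primary = DictBackend({"db_host": "primary.example"})
fallback = DictBackend({"db_host": "fallback.example", "region": "eu"})
aggregated = AggregatedBackend([primary, fallback])

print(aggregated.get_variable("db_host"))  # found in the first backend
print(aggregated.get_variable("region"))   # only the second backend has it
```

The search order, the treatment of `None` versus an empty string, and the fail-hard behaviour are all decided in a handful of lines here, and a user who wants a different order for connections than for configuration can simply write a second method with a different loop.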
