You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/03/27 23:11:10 UTC

[GitHub] [airflow] kaxil opened a new pull request #7923: [WIP] Get Airflow Connection from Environment Variables & SecretsBackend

kaxil opened a new pull request #7923: [WIP] Get Airflow Connection from Environment Variables & SecretsBackend
URL: https://github.com/apache/airflow/pull/7923
 
 
   Get Airflow Connection from :
   
   - Environment Variables
   - Hashicorp Vault
   - GCP Secrets Manager
   - AWS Secrets Manager
   
   This will reduce database connections significantly .
   
   ---
   
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [ ] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] potiuk commented on a change in pull request #7923: Get Airflow Variables from Environment Variables

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #7923: Get Airflow Variables from Environment Variables
URL: https://github.com/apache/airflow/pull/7923#discussion_r399640504
 
 

 ##########
 File path: airflow/secrets/metastore.py
 ##########
 @@ -37,3 +37,16 @@ def get_connections(self, conn_id, session=None) -> List[Connection]:
         conn_list = session.query(Connection).filter(Connection.conn_id == conn_id).all()
         session.expunge_all()
         return conn_list
+
+    @provide_session
+    def get_variable(self, key: str, session=None):
+        """
+        Get Airflow Variable from Metadata DB
+
+        :param key: Variable Key
+        :return: Variable Value
+        """
+        from airflow.models.variable import Variable
 
 Review comment:
   My opinion again:
   
   That's true that there are no cyclical imports detected by the parser, But still, logically there are two separate entities in different modules that depend on each other in both directions. For me, this is a classic example of cyclic dependency which should be avoided (maybe not at all cost, but whenever it's possible).
   
   I strongly believe the code is much cleaner and easier to understand when we avoid cyclical dependencies. Usually, cyclical dependency (not import) is a sign that we should improve the code structure (either split responsibilities better or move the classes/objects to a single file if there are strongly coupled). We've already introduced a number of such refactorings and they let us reason about the code better and (for example) introduce those performance improvements for the number of DB queries - because it was much easier to understand what happened.
   
   But that's just opinion (though I believe in ti quite strongly).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] ashb commented on a change in pull request #7923: Get Airflow Variables from Environment Variables

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #7923: Get Airflow Variables from Environment Variables
URL: https://github.com/apache/airflow/pull/7923#discussion_r399639963
 
 

 ##########
 File path: docs/concepts.rst
 ##########
 @@ -687,6 +687,34 @@ or if you need to deserialize a json object from the variable :
 
     echo {{ var.json.<variable_name> }}
 
+Storing Variables in Environment Variables
+------------------------------------------
+
+Airflow Variables can also be created and managed using Environment Variables. The environment variable
+naming convention is ``AIRFLOW_VAR_<variable_name>``, all uppercase.
+So if your variable key is ``FOO`` then the variable name should be ``AIRFLOW_VAR_FOO``.
+
+For example,
+
+.. code:: bash
+
+    export AIRFLOW_VAR_FOO=BAR
+
+    # To use JSON, store them as JSON strings
+    export AIRFLOW_VAR_FOO_JSON='{"hello":"world"}'
 
 Review comment:
   This makes it look like the _JSON suffix on the name is important, but it's not.
   
   Let's change this example name

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] potiuk commented on a change in pull request #7923: Get Airflow Variables from Environment Variables

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #7923: Get Airflow Variables from Environment Variables
URL: https://github.com/apache/airflow/pull/7923#discussion_r399640504
 
 

 ##########
 File path: airflow/secrets/metastore.py
 ##########
 @@ -37,3 +37,16 @@ def get_connections(self, conn_id, session=None) -> List[Connection]:
         conn_list = session.query(Connection).filter(Connection.conn_id == conn_id).all()
         session.expunge_all()
         return conn_list
+
+    @provide_session
+    def get_variable(self, key: str, session=None):
+        """
+        Get Airflow Variable from Metadata DB
+
+        :param key: Variable Key
+        :return: Variable Value
+        """
+        from airflow.models.variable import Variable
 
 Review comment:
   My (general) opinion again - it may apply to similar cases:
   
   That's true that there are no cyclical imports detected by the parser, But still, logically there are two separate entities in different modules that depend on each other in both directions. For me, this is a classic example of cyclic dependency which should be avoided (maybe not at all cost, but whenever it's possible).
   
   I strongly believe the code is much cleaner and easier to understand when we avoid cyclical dependencies. Usually, cyclical dependency (not import) is a sign that we should improve the code structure (either split responsibilities better or move the classes/objects to a single file if there are strongly coupled). We've already introduced a number of such refactorings and they let us reason about the code better and (for example) introduce those performance improvements for the number of DB queries - because it was much easier to understand what the code is doing.
   
   But that's just opinion (though I believe in ti quite strongly).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] kaxil commented on a change in pull request #7923: [WIP] Get Airflow Connection from Environment Variables & SecretsBackend

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #7923: [WIP] Get Airflow Connection from Environment Variables & SecretsBackend
URL: https://github.com/apache/airflow/pull/7923#discussion_r399585371
 
 

 ##########
 File path: airflow/secrets/metastore.py
 ##########
 @@ -37,3 +37,16 @@ def get_connections(self, conn_id, session=None) -> List[Connection]:
         conn_list = session.query(Connection).filter(Connection.conn_id == conn_id).all()
         session.expunge_all()
         return conn_list
+
+    @provide_session
+    def get_variable(self, key: str, session=None):
+        """
+        Get Airflow Variable from Metadata DB
+
+        :param key: Variable Key
+        :return: Variable Value
+        """
+        from airflow.models.variable import Variable
 
 Review comment:
   Yup, I will try to get rid of it as soon as I have the final draft ready :)

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] kaxil commented on issue #7923: Get Airflow Variables from Environment Variables

Posted by GitBox <gi...@apache.org>.
kaxil commented on issue #7923: Get Airflow Variables from Environment Variables
URL: https://github.com/apache/airflow/pull/7923#issuecomment-605390632
 
 
   Separated PRs into following so that it is easier to review, merge and cherry-pick:
   
   - Hashicorp Vault: https://github.com/apache/airflow/pull/7944
   - AWS Secrets Manager: https://github.com/apache/airflow/pull/7945
   - GCP Secrets Manager: https://github.com/apache/airflow/pull/7946
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] ashb commented on a change in pull request #7923: Get Airflow Variables from Environment Variables

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #7923: Get Airflow Variables from Environment Variables
URL: https://github.com/apache/airflow/pull/7923#discussion_r399639447
 
 

 ##########
 File path: airflow/secrets/metastore.py
 ##########
 @@ -37,3 +37,16 @@ def get_connections(self, conn_id, session=None) -> List[Connection]:
         conn_list = session.query(Connection).filter(Connection.conn_id == conn_id).all()
         session.expunge_all()
         return conn_list
+
+    @provide_session
+    def get_variable(self, key: str, session=None):
+        """
+        Get Airflow Variable from Metadata DB
+
+        :param key: Variable Key
+        :return: Variable Value
+        """
+        from airflow.models.variable import Variable
 
 Review comment:
   Yeah, imports in functions are almost never circular - to be able to call the function one module already needs to be fully loaded already, so a cycle can never result.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] kaxil merged pull request #7923: Get Airflow Variables from Environment Variables

Posted by GitBox <gi...@apache.org>.
kaxil merged pull request #7923: Get Airflow Variables from Environment Variables
URL: https://github.com/apache/airflow/pull/7923
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] potiuk commented on a change in pull request #7923: Get Airflow Variables from Environment Variables

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #7923: Get Airflow Variables from Environment Variables
URL: https://github.com/apache/airflow/pull/7923#discussion_r399640504
 
 

 ##########
 File path: airflow/secrets/metastore.py
 ##########
 @@ -37,3 +37,16 @@ def get_connections(self, conn_id, session=None) -> List[Connection]:
         conn_list = session.query(Connection).filter(Connection.conn_id == conn_id).all()
         session.expunge_all()
         return conn_list
+
+    @provide_session
+    def get_variable(self, key: str, session=None):
+        """
+        Get Airflow Variable from Metadata DB
+
+        :param key: Variable Key
+        :return: Variable Value
+        """
+        from airflow.models.variable import Variable
 
 Review comment:
   My opinion again:
   
   That's true that there are no cyclical imports detected by the parser, But still, logically there are two separate entities in different modules that depend on each other in both directions. For me, this is a classic example of cyclic dependency which should be avoided (maybe not at all cost, but whenever it's possible).
   
   I strongly believe the code is much cleaner and easier to understand when we avoid cyclical dependencies. Usually, cyclical dependency (not import) is a sign that we should improve the code structure (either split responsibilities better or move the classes/objects to a single file if there are strongly coupled). We've already introduced a number of such refactorings and they let us reason about the code better and (for example) introduce those performance improvements for the number of DB queries - because it was much easier to understand what the code is doing.
   
   But that's just opinion (though I believe in ti quite strongly).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] mik-laj commented on a change in pull request #7923: [WIP] Get Airflow Connection from Environment Variables & SecretsBackend

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #7923: [WIP] Get Airflow Connection from Environment Variables & SecretsBackend
URL: https://github.com/apache/airflow/pull/7923#discussion_r399583454
 
 

 ##########
 File path: airflow/secrets/metastore.py
 ##########
 @@ -37,3 +37,16 @@ def get_connections(self, conn_id, session=None) -> List[Connection]:
         conn_list = session.query(Connection).filter(Connection.conn_id == conn_id).all()
         session.expunge_all()
         return conn_list
+
+    @provide_session
+    def get_variable(self, key: str, session=None):
+        """
+        Get Airflow Variable from Metadata DB
+
+        :param key: Variable Key
+        :return: Variable Value
+        """
+        from airflow.models.variable import Variable
 
 Review comment:
   Cyclical import?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] kaxil commented on a change in pull request #7923: Get Airflow Variables from Environment Variables

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #7923: Get Airflow Variables from Environment Variables
URL: https://github.com/apache/airflow/pull/7923#discussion_r399648015
 
 

 ##########
 File path: docs/concepts.rst
 ##########
 @@ -663,7 +663,7 @@ accessible and modifiable through the UI.
 
 .. code:: python
 
-    from airflow.models import Variable
+    from airflow.models.variable import Variable
 
 Review comment:
   Fixed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] kaxil commented on a change in pull request #7923: Get Airflow Variables from Environment Variables

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #7923: Get Airflow Variables from Environment Variables
URL: https://github.com/apache/airflow/pull/7923#discussion_r399648000
 
 

 ##########
 File path: docs/concepts.rst
 ##########
 @@ -687,6 +687,34 @@ or if you need to deserialize a json object from the variable :
 
     echo {{ var.json.<variable_name> }}
 
+Storing Variables in Environment Variables
+------------------------------------------
+
+Airflow Variables can also be created and managed using Environment Variables. The environment variable
+naming convention is ``AIRFLOW_VAR_<variable_name>``, all uppercase.
+So if your variable key is ``FOO`` then the variable name should be ``AIRFLOW_VAR_FOO``.
+
+For example,
+
+.. code:: bash
+
+    export AIRFLOW_VAR_FOO=BAR
+
+    # To use JSON, store them as JSON strings
+    export AIRFLOW_VAR_FOO_JSON='{"hello":"world"}'
 
 Review comment:
   Fixed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] kaxil commented on a change in pull request #7923: Get Airflow Variables from Environment Variables

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #7923: Get Airflow Variables from Environment Variables
URL: https://github.com/apache/airflow/pull/7923#discussion_r399615781
 
 

 ##########
 File path: airflow/secrets/metastore.py
 ##########
 @@ -37,3 +37,16 @@ def get_connections(self, conn_id, session=None) -> List[Connection]:
         conn_list = session.query(Connection).filter(Connection.conn_id == conn_id).all()
         session.expunge_all()
         return conn_list
+
+    @provide_session
+    def get_variable(self, key: str, session=None):
+        """
+        Get Airflow Variable from Metadata DB
+
+        :param key: Variable Key
+        :return: Variable Value
+        """
+        from airflow.models.variable import Variable
 
 Review comment:
   This looks tough as we need Variable model here and we need to get the value in Variables.py file to keep it fully backward compatible

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] mik-laj commented on a change in pull request #7923: [WIP] Get Airflow Connection from Environment Variables & SecretsBackend

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #7923: [WIP] Get Airflow Connection from Environment Variables & SecretsBackend
URL: https://github.com/apache/airflow/pull/7923#discussion_r399583454
 
 

 ##########
 File path: airflow/secrets/metastore.py
 ##########
 @@ -37,3 +37,16 @@ def get_connections(self, conn_id, session=None) -> List[Connection]:
         conn_list = session.query(Connection).filter(Connection.conn_id == conn_id).all()
         session.expunge_all()
         return conn_list
+
+    @provide_session
+    def get_variable(self, key: str, session=None):
+        """
+        Get Airflow Variable from Metadata DB
+
+        :param key: Variable Key
+        :return: Variable Value
+        """
+        from airflow.models.variable import Variable
 
 Review comment:
   Cyclical import? 😿 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] ashb commented on a change in pull request #7923: Get Airflow Variables from Environment Variables

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #7923: Get Airflow Variables from Environment Variables
URL: https://github.com/apache/airflow/pull/7923#discussion_r399639516
 
 

 ##########
 File path: docs/concepts.rst
 ##########
 @@ -663,7 +663,7 @@ accessible and modifiable through the UI.
 
 .. code:: python
 
-    from airflow.models import Variable
+    from airflow.models.variable import Variable
 
 Review comment:
   Did you mean to change this?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services