You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/02/23 08:44:25 UTC

[GitHub] [airflow] haard opened a new pull request #14380: Prevent unknown env vars from crashing a worker

haard opened a new pull request #14380:
URL: https://github.com/apache/airflow/pull/14380


   An environment variable that almost looks like an Airflow var but is not (e.g `AIRFLOW__SOME__brokenthing`) leads to a NPE in `airflow/configuration.py`. In addition, when using the Celery Executor and triggering this exception, the context is silenced in the log message (using `.error` instead of `.exception`) which makes troubleshooting needlessly hard.
   
   This patch ignores but logs unexpected env vars, and uses `log.exception` when running a Celery task fails.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] XD-DENG merged pull request #14380: Prevent mixed case env vars from crashing processes like worker

Posted by GitBox <gi...@apache.org>.
XD-DENG merged pull request #14380:
URL: https://github.com/apache/airflow/pull/14380


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] XD-DENG commented on pull request #14380: Prevent mixed case env vars from crashing a worker

Posted by GitBox <gi...@apache.org>.
XD-DENG commented on pull request #14380:
URL: https://github.com/apache/airflow/pull/14380#issuecomment-794413478


   Thanks @haard for the PR. I understand the only functional difference here is to avoid crash if there is "_mixed case env vars (starting with AIRFLOW__ but then not all uppercase)_". From that perspective, I agree this PR does its job and LGTM.
   
   What we may want to discuss a bit further are different options:
   - Option A: allow lowercases, and help users convert to upper case if necessary. This may not make sense given environment variables are case sensitive in Unix. But of course there may be different opinions.
   - Option B: don't allow lowercase, and choose to ignore (what you are doing)
   - Option C: don't allow lowercase, and reject explicitly. Maybe not making sense, because users may purposely want `AIRFLOW__CORE__lowercasekey` for whatever reason.
   
   So overall, I agree with your direction, i.e. option B.
   
   @ashb , may need your input here. Need your approval to unblock anyway :)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] haard commented on pull request #14380: Prevent mixed case env vars from crashing a worker

Posted by GitBox <gi...@apache.org>.
haard commented on pull request #14380:
URL: https://github.com/apache/airflow/pull/14380#issuecomment-784105287


   > 
   > 
   > Hi @haard I assume you meant the cases where the env var cannot be found properly if the _case_ is not given correctly?
   
   Yes, sorry. Edited the description, I had started writing it before I pinned down the issue properly.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] XD-DENG commented on a change in pull request #14380: Prevent mixed case env vars from crashing a worker

Posted by GitBox <gi...@apache.org>.
XD-DENG commented on a change in pull request #14380:
URL: https://github.com/apache/airflow/pull/14380#discussion_r590685699



##########
File path: airflow/executors/celery_executor.py
##########
@@ -116,7 +116,7 @@ def _execute_in_fork(command_to_exec: CommandType) -> None:
         args.func(args)
         ret = 0
     except Exception as e:  # pylint: disable=broad-except
-        log.error("Failed to execute task %s.", str(e))
+        log.exception("Failed to execute task %s.", str(e))

Review comment:
       I assume the purpose of this change is to have trackback info in logs?
   
   If there is other purpose(s) for this change, please clarify so we have full info. Thanks.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] boring-cyborg[bot] commented on pull request #14380: Prevent mixed case env vars from crashing processes like worker

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on pull request #14380:
URL: https://github.com/apache/airflow/pull/14380#issuecomment-795549893


   Awesome work, congrats on your first merged pull request!
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #14380: Prevent unknown env vars from crashing a worker

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #14380:
URL: https://github.com/apache/airflow/pull/14380#discussion_r580891486



##########
File path: tests/core/test_configuration.py
##########
@@ -80,6 +80,15 @@ def test_case_sensitivity(self):
         assert conf.get("core", "PERCENT") == "with%inside"
         assert conf.get("CORE", "PERCENT") == "with%inside"
 
+    def test_config_as_dict(self):
+        """Test that getting config as dict works even if
+        environment has non-legal env vars"""
+        with unittest.mock.patch.dict('os.environ'):
+            os.environ['AIRFLOW__VAR__broken'] = "not_ok"
+            asdict = conf.as_dict(raw=True, display_sensitive=True)
+        assert asdict.get('VAR') is None

Review comment:
       This is not the behaviour I would expect. It should instead still be present.
   
   There are a few plugins out there that add their own config sections, and that is something we want to support.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] XD-DENG commented on a change in pull request #14380: Prevent mixed case env vars from crashing a worker

Posted by GitBox <gi...@apache.org>.
XD-DENG commented on a change in pull request #14380:
URL: https://github.com/apache/airflow/pull/14380#discussion_r590700456



##########
File path: tests/core/test_configuration.py
##########
@@ -80,6 +80,15 @@ def test_case_sensitivity(self):
         assert conf.get("core", "PERCENT") == "with%inside"
         assert conf.get("CORE", "PERCENT") == "with%inside"
 
+    def test_config_as_dict(self):
+        """Test that getting config as dict works even if
+        environment has non-legal env vars"""
+        with unittest.mock.patch.dict('os.environ'):
+            os.environ['AIRFLOW__VAR__broken'] = "not_ok"
+            asdict = conf.as_dict(raw=True, display_sensitive=True)
+        assert asdict.get('VAR') is None

Review comment:
       I think what @ashb wants is actually aligned with what @haard is trying to do here.
   
   We want to support the user-defined config sections/keys. But if user gives an env var like `AIRFLOW__S__key`, they expect to have something like `conf.as_dict().get('S').get('key')` OR `conf.as_dict().get('S').get('KEY')`. But actually the current behaviour is the whole process crashes.
   
   What @haard does here is to avoid crash + enforce all-uppercase env var (otherwise ignored). Personally I'm ok with this.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] XD-DENG commented on pull request #14380: Prevent unknown env vars from crashing a worker

Posted by GitBox <gi...@apache.org>.
XD-DENG commented on pull request #14380:
URL: https://github.com/apache/airflow/pull/14380#issuecomment-784037019


   Hi @haard I assume you meant the cases where the env var cannot be found properly if the *case* is not given correctly?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] XD-DENG commented on pull request #14380: Prevent mixed case env vars from crashing a worker

Posted by GitBox <gi...@apache.org>.
XD-DENG commented on pull request #14380:
URL: https://github.com/apache/airflow/pull/14380#issuecomment-795543755


   The CI errors here are the same to the ones @kaxil has described in https://github.com/apache/airflow/pull/14694 .
   Have validated that the tests relevant to this PR itself have passed. Gonna merge.
   
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] haard commented on a change in pull request #14380: Prevent mixed case env vars from crashing a worker

Posted by GitBox <gi...@apache.org>.
haard commented on a change in pull request #14380:
URL: https://github.com/apache/airflow/pull/14380#discussion_r580932236



##########
File path: tests/core/test_configuration.py
##########
@@ -80,6 +80,15 @@ def test_case_sensitivity(self):
         assert conf.get("core", "PERCENT") == "with%inside"
         assert conf.get("CORE", "PERCENT") == "with%inside"
 
+    def test_config_as_dict(self):
+        """Test that getting config as dict works even if
+        environment has non-legal env vars"""
+        with unittest.mock.patch.dict('os.environ'):
+            os.environ['AIRFLOW__VAR__broken'] = "not_ok"
+            asdict = conf.as_dict(raw=True, display_sensitive=True)
+        assert asdict.get('VAR') is None

Review comment:
       
   > This is not the behaviour I would expect. It should instead still be present.
   > 
   > There are a few plugins out there that add their own config sections, and that is something we want to support.
   
   If they do and do not enforce ALL_UPPER_CASE they will break if using `as_dict()` with the current code, with this patch they will be ignored but not crash. I don't know that allowing mixed case env vars is wanted?
   The only functional change here is to not crash if `is_dict` is called while there is mixed case env vars (starting with `AIRFLOW__` but then not all uppercase). 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #14380: Prevent mixed case env vars from crashing a worker

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #14380:
URL: https://github.com/apache/airflow/pull/14380#discussion_r588768950



##########
File path: tests/core/test_configuration.py
##########
@@ -80,6 +80,15 @@ def test_case_sensitivity(self):
         assert conf.get("core", "PERCENT") == "with%inside"
         assert conf.get("CORE", "PERCENT") == "with%inside"
 
+    def test_config_as_dict(self):
+        """Test that getting config as dict works even if
+        environment has non-legal env vars"""
+        with unittest.mock.patch.dict('os.environ'):
+            os.environ['AIRFLOW__VAR__broken'] = "not_ok"
+            asdict = conf.as_dict(raw=True, display_sensitive=True)
+        assert asdict.get('VAR') is None

Review comment:
       I'll take a look on Monday if XD doesn't get to it first




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] haard commented on pull request #14380: Prevent mixed case env vars from crashing a worker

Posted by GitBox <gi...@apache.org>.
haard commented on pull request #14380:
URL: https://github.com/apache/airflow/pull/14380#issuecomment-795200592


   > 
   > 
   > Thanks again @haard . LGTM.
   > 
   > May you please rebase to the latest `master` branch before we can merge? Thanks
   
   Thanks, done!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] haard commented on a change in pull request #14380: Prevent mixed case env vars from crashing a worker

Posted by GitBox <gi...@apache.org>.
haard commented on a change in pull request #14380:
URL: https://github.com/apache/airflow/pull/14380#discussion_r590249604



##########
File path: tests/core/test_configuration.py
##########
@@ -80,6 +80,15 @@ def test_case_sensitivity(self):
         assert conf.get("core", "PERCENT") == "with%inside"
         assert conf.get("CORE", "PERCENT") == "with%inside"
 
+    def test_config_as_dict(self):
+        """Test that getting config as dict works even if
+        environment has non-legal env vars"""
+        with unittest.mock.patch.dict('os.environ'):
+            os.environ['AIRFLOW__VAR__broken'] = "not_ok"
+            asdict = conf.as_dict(raw=True, display_sensitive=True)
+        assert asdict.get('VAR') is None

Review comment:
       Ping @ashb @XD-DENG 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] haard commented on a change in pull request #14380: Prevent mixed case env vars from crashing a worker

Posted by GitBox <gi...@apache.org>.
haard commented on a change in pull request #14380:
URL: https://github.com/apache/airflow/pull/14380#discussion_r588754638



##########
File path: tests/core/test_configuration.py
##########
@@ -80,6 +80,15 @@ def test_case_sensitivity(self):
         assert conf.get("core", "PERCENT") == "with%inside"
         assert conf.get("CORE", "PERCENT") == "with%inside"
 
+    def test_config_as_dict(self):
+        """Test that getting config as dict works even if
+        environment has non-legal env vars"""
+        with unittest.mock.patch.dict('os.environ'):
+            os.environ['AIRFLOW__VAR__broken'] = "not_ok"
+            asdict = conf.as_dict(raw=True, display_sensitive=True)
+        assert asdict.get('VAR') is None

Review comment:
       @ashb @XD-DENG I believe the PR behaviour is conservative (fixing the crash, not adding behaviour). Adding the inclusion of the MIXED_case env vars without breaking other things in the configuration module looks like it would take larger (in scope) changes. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] haard commented on a change in pull request #14380: Prevent mixed case env vars from crashing a worker

Posted by GitBox <gi...@apache.org>.
haard commented on a change in pull request #14380:
URL: https://github.com/apache/airflow/pull/14380#discussion_r590704917



##########
File path: airflow/executors/celery_executor.py
##########
@@ -116,7 +116,7 @@ def _execute_in_fork(command_to_exec: CommandType) -> None:
         args.func(args)
         ret = 0
     except Exception as e:  # pylint: disable=broad-except
-        log.error("Failed to execute task %s.", str(e))
+        log.exception("Failed to execute task %s.", str(e))

Review comment:
       Yes, this is only to get the stack trace, since otherwise there is very little to help troubleshooting - I ran with this patch to be able to find the issue with mixed case env vars at all.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #14380: Prevent mixed case env vars from crashing a worker

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #14380:
URL: https://github.com/apache/airflow/pull/14380#issuecomment-794487863


   The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest master at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] boring-cyborg[bot] commented on pull request #14380: Prevent unknown env vars from crashing a worker

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on pull request #14380:
URL: https://github.com/apache/airflow/pull/14380#issuecomment-784009644


   Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst)
   Here are some useful points:
   - Pay attention to the quality of your code (flake8, pylint and type annotations). Our [pre-commits]( https://github.com/apache/airflow/blob/master/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that.
   - In case of a new feature add useful documentation (in docstrings or in `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/master/docs/apache-airflow/howto/custom-operator.rst) Consider adding an example DAG that shows how users should use it.
   - Consider using [Breeze environment](https://github.com/apache/airflow/blob/master/BREEZE.rst) for testing locally, itโ€™s a heavy docker but it ships with a working Airflow and a lot of integrations.
   - Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
   - Please follow [ASF Code of Conduct](https://www.apache.org/foundation/policies/conduct) for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
   - Be sure to read the [Airflow Coding style]( https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#coding-style-and-best-practices).
   Apache Airflow is a community-driven project and together we are making it better ๐Ÿš€.
   In case of doubts contact the developers at:
   Mailing List: dev@airflow.apache.org
   Slack: https://s.apache.org/airflow-slack
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org