You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by po...@apache.org on 2021/09/18 17:16:49 UTC

[airflow] branch main updated: Explain sentry default environment variable for subprocess hook (#18346)

This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/main by this push:
     new 50e2e6e  Explain sentry default environment variable for subprocess hook (#18346)
50e2e6e is described below

commit 50e2e6e5e7c6c758f051daec7eda7c147e9e8bda
Author: Jarek Potiuk <ja...@potiuk.com>
AuthorDate: Sat Sep 18 19:16:33 2021 +0200

    Explain sentry default environment variable for subprocess hook (#18346)
    
    * Explain sentry default environment variable for subprocess hook
    
    Sentry by default monkey-patches standard library POpen function
    to capture and pass current process' environment to subprocess
    with `SUBPROCESS_` prefix. This break SubprocessHook tests
    when sentry tests are run before SubprocessHook tests, and also
    it modifies SubprocessHook behaviour (and promise) in production
    environment as well.
    
    This PR:
    
    * adds documentation to both sentry documentation and the
      SubprocessHook documentation explaining the interaction between
      the two
    * Adds documentation explaining how to disable this default
      Sentry behaviour
    * disables default integrations in the Sentry tests to avoid
      side-effects
    
    Fixes: #18268
---
 airflow/hooks/subprocess.py                       |  3 +++
 docs/apache-airflow/logging-monitoring/errors.rst | 21 +++++++++++++++++++++
 tests/core/test_sentry.py                         |  2 +-
 3 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/airflow/hooks/subprocess.py b/airflow/hooks/subprocess.py
index c54a818..f04b2e7 100644
--- a/airflow/hooks/subprocess.py
+++ b/airflow/hooks/subprocess.py
@@ -50,6 +50,9 @@ class SubprocessHook(BaseHook):
         :param command: the command to run
         :param env: Optional dict containing environment variables to be made available to the shell
             environment in which ``command`` will be executed.  If omitted, ``os.environ`` will be used.
+            Note, that in case you have Sentry configured, original variables from the environment
+            will also be passed to the subprocess with ``SUBPROCESS_`` prefix. See
+            :doc:`/logging-monitoring/errors` for details.
         :param output_encoding: encoding to use for decoding stdout
         :param cwd: Working directory to run the command in.
             If None (default), the command is run in a temporary directory.
diff --git a/docs/apache-airflow/logging-monitoring/errors.rst b/docs/apache-airflow/logging-monitoring/errors.rst
index 578666b..c8c303d 100644
--- a/docs/apache-airflow/logging-monitoring/errors.rst
+++ b/docs/apache-airflow/logging-monitoring/errors.rst
@@ -70,3 +70,24 @@ Name                                    Description
 ``completed_tasks[operator]``           Task operator of task that executed before failed task
 ``completed_tasks[duration]``           Duration in seconds of task that executed before failed task
 ======================================= ==============================================================
+
+
+Impact of Sentry on Environment variables passed to Subprocess Hook
+-------------------------------------------------------------------
+
+When Sentry is enabled, by default it changes standard library to pass all environment variables to
+subprocesses opened by Airflow. This changes the default behaviour of
+:class:`airflow.hooks.subprocess.SubprocessHook` - always all environment variables are passed to the
+subprocess executed with specific set of environment variables. In this case not only the specified
+environment variables are passed but also all existing environment variables are passed with
+``SUBPROCESS_`` prefix added. This happens also for all other subprocesses.
+
+This behaviour can be disabled by setting ``default_integrations`` sentry configuration parameter to
+``False`` which disables ``StdlibIntegration``. This however also disables other default integrations
+and you need to enable them manually if you want to get them enabled,
+see `Sentry Default Integrations <https://docs.sentry.io/platforms/python/guides/wsgi/configuration/integrations/default-integrations/>`_
+
+.. code-block:: ini
+
+    [sentry]
+    default_integrations = False
diff --git a/tests/core/test_sentry.py b/tests/core/test_sentry.py
index 5a5fa56..bb962cc 100644
--- a/tests/core/test_sentry.py
+++ b/tests/core/test_sentry.py
@@ -77,7 +77,7 @@ class TestSentryHook:
 
     @pytest.fixture
     def sentry(self):
-        with conf_vars({('sentry', 'sentry_on'): 'True'}):
+        with conf_vars({('sentry', 'sentry_on'): 'True', ('sentry', 'default_integrations'): 'False'}):
             from airflow import sentry
 
             importlib.reload(sentry)