You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/12/08 21:23:55 UTC

[GitHub] [airflow] csm10495 opened a new pull request, #28239: Add a new config for adapting masked secrets to make it easier to prevent secret leakage in logs

csm10495 opened a new pull request, #28239:
URL: https://github.com/apache/airflow/pull/28239

   We've hit some cases where secrets accidentally leak because they take slightly different forms than expected in logs. One of example of this was a password was urlquoted then logged as part of a url. 
   
   Using this new config, we can specify that we want to mask secrets and adaptations of them (like urlquoted) versions. A custom function can also be written that returns a list of adaptations to filter in logs as well.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ashb commented on a diff in pull request #28239: Add a new config for adapting masked secrets to make it easier to prevent secret leakage in logs

Posted by GitBox <gi...@apache.org>.
ashb commented on code in PR #28239:
URL: https://github.com/apache/airflow/pull/28239#discussion_r1047659711


##########
airflow/utils/log/secrets_masker.py:
##########
@@ -240,17 +241,53 @@ def redact(self, item: Redactable, name: str | None = None) -> Redacted:
         """
         return self._redact(item, name, depth=0)
 
-    def add_mask(self, secret: str | dict | Iterable, name: str | None = None):
-        """Add a new secret to be masked to this filter instance."""
+    @cached_property
+    def _mask_adapter(self) -> None | Callable:
+        """Pulls the secret mask adapter from config.
+
+        This lives in a function here to be cached and only hit the config once.
+        """
         from airflow.configuration import conf
 
-        test_mode: bool = conf.getboolean("core", "unit_test_mode")
+        return conf.getimport("logging", "secret_mask_adapter", fallback=None)
+
+    @cached_property
+    def _test_mode(self) -> bool:
+        """Pulls the unit test mode flag from config.
+
+        This lives in a function here to be cached and only hit the config once.
+        """
+        from airflow.configuration import conf
+
+        return conf.getboolean("core", "unit_test_mode")
+
+    def _add_adaptations(self, secret: str | dict | Iterable, name: str | None = None):
+        """Adds any adaptations to the secret that should be masked."""
+        if self._mask_adapter:
+            # This can return an iterable of secrets to mask OR a single secret as a string
+            secret_or_secrets = self._mask_adapter(secret)
+
+            if not isinstance(secret_or_secrets, str):
+                # if its not a string, it must be an iterable
+                for secret in secret_or_secrets:
+                    self.add_mask(secret, name, add_adaptations=False)
+            else:
+                self.add_mask(secret_or_secrets, name, add_adaptations=False)
+
+    def add_mask(self, secret: str | dict | Iterable, name: str | None = None, add_adaptations: bool = True):

Review Comment:
   Do we need the `add_adaptations` param? When (and how) would that ever be set to false?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ashb commented on a diff in pull request #28239: Add a new config for adapting masked secrets to make it easier to prevent secret leakage in logs

Posted by GitBox <gi...@apache.org>.
ashb commented on code in PR #28239:
URL: https://github.com/apache/airflow/pull/28239#discussion_r1047655862


##########
airflow/utils/log/secrets_masker.py:
##########
@@ -240,17 +241,58 @@ def redact(self, item: Redactable, name: str | None = None) -> Redacted:
         """
         return self._redact(item, name, depth=0)
 
-    def add_mask(self, secret: str | dict | Iterable, name: str | None = None):
-        """Add a new secret to be masked to this filter instance."""
+    @cached_property
+    def _mask_adapter(self) -> None | Callable:
+        """Pulls the secret mask adapter from config.
+
+        This lives in a function here to be cached and only hit the config once.
+        """
         from airflow.configuration import conf
 
-        test_mode: bool = conf.getboolean("core", "unit_test_mode")
+        mask_adapter: Callable | None = None
+        try:
+            mask_adapter = conf.getimport("logging", "secret_mask_adapter")
+        except AirflowConfigException:
+            pass
+        return mask_adapter

Review Comment:
   I think this would work as
   
   ```suggestion
           return conf.getimport("logging", "secret_mask_adapter", fallback=None)
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ashb merged pull request #28239: Add a new config for adapting masked secrets to make it easier to prevent secret leakage in logs

Posted by GitBox <gi...@apache.org>.
ashb merged PR #28239:
URL: https://github.com/apache/airflow/pull/28239


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] csm10495 commented on pull request #28239: Add a new config for adapting masked secrets to make it easier to prevent secret leakage in logs

Posted by GitBox <gi...@apache.org>.
csm10495 commented on PR #28239:
URL: https://github.com/apache/airflow/pull/28239#issuecomment-1349620878

   I think i have to go back to repo to put together those suggestions.. will do that a bit later. Check again tomorrow @ashb . Thanks! 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] csm10495 commented on a diff in pull request #28239: Add a new config for adapting masked secrets to make it easier to prevent secret leakage in logs

Posted by GitBox <gi...@apache.org>.
csm10495 commented on code in PR #28239:
URL: https://github.com/apache/airflow/pull/28239#discussion_r1043842130


##########
airflow/config_templates/config.yml:
##########
@@ -713,6 +713,17 @@
       type: string
       example: ~
       default: "airflow.utils.log.timezone_aware.TimezoneAware"
+    - name: secret_mask_adapter
+      description: |
+        An import path to a function to add adaptations of each secret added with
+        `airflow.utils.log.secrets_masker.mask_secret` to the log message. The given function
+        is expected to require a single parameter: the secret to be adapted. It may return a
+        single adaptation of the secret or an iterable of adaptations to each be masked as secrets.
+        The original secret will be masked as well as any adaptations returned.
+      version_added: 2.5.0

Review Comment:
   Dang. I'm behind the times. Thanks.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on a diff in pull request #28239: Add a new config for adapting masked secrets to make it easier to prevent secret leakage in logs

Posted by GitBox <gi...@apache.org>.
potiuk commented on code in PR #28239:
URL: https://github.com/apache/airflow/pull/28239#discussion_r1043841553


##########
airflow/config_templates/config.yml:
##########
@@ -713,6 +713,17 @@
       type: string
       example: ~
       default: "airflow.utils.log.timezone_aware.TimezoneAware"
+    - name: secret_mask_adapter
+      description: |
+        An import path to a function to add adaptations of each secret added with
+        `airflow.utils.log.secrets_masker.mask_secret` to the log message. The given function
+        is expected to require a single parameter: the secret to be adapted. It may return a
+        single adaptation of the secret or an iterable of adaptations to each be masked as secrets.
+        The original secret will be masked as well as any adaptations returned.
+      version_added: 2.5.0

Review Comment:
   ```suggestion
         version_added: 2.6.0
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ashb commented on a diff in pull request #28239: Add a new config for adapting masked secrets to make it easier to prevent secret leakage in logs

Posted by GitBox <gi...@apache.org>.
ashb commented on code in PR #28239:
URL: https://github.com/apache/airflow/pull/28239#discussion_r1047660905


##########
airflow/utils/log/secrets_masker.py:
##########
@@ -240,17 +241,53 @@ def redact(self, item: Redactable, name: str | None = None) -> Redacted:
         """
         return self._redact(item, name, depth=0)
 
-    def add_mask(self, secret: str | dict | Iterable, name: str | None = None):
-        """Add a new secret to be masked to this filter instance."""
+    @cached_property
+    def _mask_adapter(self) -> None | Callable:
+        """Pulls the secret mask adapter from config.
+
+        This lives in a function here to be cached and only hit the config once.
+        """
         from airflow.configuration import conf
 
-        test_mode: bool = conf.getboolean("core", "unit_test_mode")
+        return conf.getimport("logging", "secret_mask_adapter", fallback=None)
+
+    @cached_property
+    def _test_mode(self) -> bool:
+        """Pulls the unit test mode flag from config.
+
+        This lives in a function here to be cached and only hit the config once.
+        """
+        from airflow.configuration import conf
+
+        return conf.getboolean("core", "unit_test_mode")
+
+    def _add_adaptations(self, secret: str | dict | Iterable, name: str | None = None):
+        """Adds any adaptations to the secret that should be masked."""
+        if self._mask_adapter:
+            # This can return an iterable of secrets to mask OR a single secret as a string
+            secret_or_secrets = self._mask_adapter(secret)
+
+            if not isinstance(secret_or_secrets, str):
+                # if its not a string, it must be an iterable
+                for secret in secret_or_secrets:
+                    self.add_mask(secret, name, add_adaptations=False)
+            else:
+                self.add_mask(secret_or_secrets, name, add_adaptations=False)
+
+    def add_mask(self, secret: str | dict | Iterable, name: str | None = None, add_adaptations: bool = True):

Review Comment:
   Oh I see.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ashb commented on a diff in pull request #28239: Add a new config for adapting masked secrets to make it easier to prevent secret leakage in logs

Posted by GitBox <gi...@apache.org>.
ashb commented on code in PR #28239:
URL: https://github.com/apache/airflow/pull/28239#discussion_r1047675719


##########
airflow/utils/log/secrets_masker.py:
##########
@@ -240,17 +241,53 @@ def redact(self, item: Redactable, name: str | None = None) -> Redacted:
         """
         return self._redact(item, name, depth=0)
 
-    def add_mask(self, secret: str | dict | Iterable, name: str | None = None):
-        """Add a new secret to be masked to this filter instance."""
+    @cached_property
+    def _mask_adapter(self) -> None | Callable:
+        """Pulls the secret mask adapter from config.
+
+        This lives in a function here to be cached and only hit the config once.
+        """
         from airflow.configuration import conf
 
-        test_mode: bool = conf.getboolean("core", "unit_test_mode")
+        return conf.getimport("logging", "secret_mask_adapter", fallback=None)
+
+    @cached_property
+    def _test_mode(self) -> bool:
+        """Pulls the unit test mode flag from config.
+
+        This lives in a function here to be cached and only hit the config once.
+        """
+        from airflow.configuration import conf
+
+        return conf.getboolean("core", "unit_test_mode")
+
+    def _add_adaptations(self, secret: str | dict | Iterable, name: str | None = None):
+        """Adds any adaptations to the secret that should be masked."""
+        if self._mask_adapter:
+            # This can return an iterable of secrets to mask OR a single secret as a string
+            secret_or_secrets = self._mask_adapter(secret)
+
+            if not isinstance(secret_or_secrets, str):
+                # if its not a string, it must be an iterable
+                for secret in secret_or_secrets:
+                    self.add_mask(secret, name, add_adaptations=False)
+            else:
+                self.add_mask(secret_or_secrets, name, add_adaptations=False)
+
+    def add_mask(self, secret: str | dict | Iterable, name: str | None = None, add_adaptations: bool = True):
+        """Add a new secret to be masked to this filter instance.
+
+        If add_adaptations is True, the secret mask adapter will be used to add adaptations for the secret
+        as well.
+        """
         if isinstance(secret, dict):
             for k, v in secret.items():
                 self.add_mask(v, k)
         elif isinstance(secret, str):
-            if not secret or (test_mode and secret in SECRETS_TO_SKIP_MASKING_FOR_TESTS):
+            if not secret or (self._test_mode and secret in SECRETS_TO_SKIP_MASKING_FOR_TESTS):
                 return
+            if add_adaptations:
+                self._add_adaptations(secret, name)
             pattern = re.escape(secret)
             if pattern not in self.patterns and (not name or should_hide_value_for_key(name)):
                 self.patterns.add(pattern)

Review Comment:
   This suggestion won't blindly apply, but we should remove the current `self.replacer = re.compile("|".join(self.patterns))` line
   
   ```suggestion
               new_mask = False
               for s in self._adaptations(secret):
                   pattern = re.escape(secret)
                   if pattern not in self.patterns and (not name or should_hide_value_for_key(name)):
                       self.patterns.add(pattern)
                       new_mask = True
           if new_mask:
               self.replacer = re.compile("|".join(self.patterns))                    
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ashb commented on a diff in pull request #28239: Add a new config for adapting masked secrets to make it easier to prevent secret leakage in logs

Posted by GitBox <gi...@apache.org>.
ashb commented on code in PR #28239:
URL: https://github.com/apache/airflow/pull/28239#discussion_r1047679429


##########
airflow/utils/log/secrets_masker.py:
##########
@@ -240,17 +241,53 @@ def redact(self, item: Redactable, name: str | None = None) -> Redacted:
         """
         return self._redact(item, name, depth=0)
 
-    def add_mask(self, secret: str | dict | Iterable, name: str | None = None):
-        """Add a new secret to be masked to this filter instance."""
+    @cached_property
+    def _mask_adapter(self) -> None | Callable:
+        """Pulls the secret mask adapter from config.
+
+        This lives in a function here to be cached and only hit the config once.
+        """
         from airflow.configuration import conf
 
-        test_mode: bool = conf.getboolean("core", "unit_test_mode")
+        return conf.getimport("logging", "secret_mask_adapter", fallback=None)
+
+    @cached_property
+    def _test_mode(self) -> bool:
+        """Pulls the unit test mode flag from config.
+
+        This lives in a function here to be cached and only hit the config once.
+        """
+        from airflow.configuration import conf
+
+        return conf.getboolean("core", "unit_test_mode")
+
+    def _add_adaptations(self, secret: str | dict | Iterable, name: str | None = None):
+        """Adds any adaptations to the secret that should be masked."""
+        if self._mask_adapter:
+            # This can return an iterable of secrets to mask OR a single secret as a string
+            secret_or_secrets = self._mask_adapter(secret)
+
+            if not isinstance(secret_or_secrets, str):
+                # if its not a string, it must be an iterable
+                for secret in secret_or_secrets:
+                    self.add_mask(secret, name, add_adaptations=False)
+            else:
+                self.add_mask(secret_or_secrets, name, add_adaptations=False)
+
+    def add_mask(self, secret: str | dict | Iterable, name: str | None = None, add_adaptations: bool = True):
+        """Add a new secret to be masked to this filter instance.
+
+        If add_adaptations is True, the secret mask adapter will be used to add adaptations for the secret
+        as well.
+        """
         if isinstance(secret, dict):
             for k, v in secret.items():
                 self.add_mask(v, k)
         elif isinstance(secret, str):
-            if not secret or (test_mode and secret in SECRETS_TO_SKIP_MASKING_FOR_TESTS):
+            if not secret or (self._test_mode and secret in SECRETS_TO_SKIP_MASKING_FOR_TESTS):
                 return
+            if add_adaptations:
+                self._add_adaptations(secret, name)

Review Comment:
   ```suggestion
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ashb commented on a diff in pull request #28239: Add a new config for adapting masked secrets to make it easier to prevent secret leakage in logs

Posted by GitBox <gi...@apache.org>.
ashb commented on code in PR #28239:
URL: https://github.com/apache/airflow/pull/28239#discussion_r1047671128


##########
airflow/utils/log/secrets_masker.py:
##########
@@ -240,17 +241,53 @@ def redact(self, item: Redactable, name: str | None = None) -> Redacted:
         """
         return self._redact(item, name, depth=0)
 
-    def add_mask(self, secret: str | dict | Iterable, name: str | None = None):
-        """Add a new secret to be masked to this filter instance."""
+    @cached_property
+    def _mask_adapter(self) -> None | Callable:
+        """Pulls the secret mask adapter from config.
+
+        This lives in a function here to be cached and only hit the config once.
+        """
         from airflow.configuration import conf
 
-        test_mode: bool = conf.getboolean("core", "unit_test_mode")
+        return conf.getimport("logging", "secret_mask_adapter", fallback=None)
+
+    @cached_property
+    def _test_mode(self) -> bool:
+        """Pulls the unit test mode flag from config.
+
+        This lives in a function here to be cached and only hit the config once.
+        """
+        from airflow.configuration import conf
+
+        return conf.getboolean("core", "unit_test_mode")
+
+    def _add_adaptations(self, secret: str | dict | Iterable, name: str | None = None):
+        """Adds any adaptations to the secret that should be masked."""
+        if self._mask_adapter:
+            # This can return an iterable of secrets to mask OR a single secret as a string
+            secret_or_secrets = self._mask_adapter(secret)
+
+            if not isinstance(secret_or_secrets, str):
+                # if its not a string, it must be an iterable
+                for secret in secret_or_secrets:
+                    self.add_mask(secret, name, add_adaptations=False)
+            else:
+                self.add_mask(secret_or_secrets, name, add_adaptations=False)

Review Comment:
   ```suggestion
       def _adaptations(self, secret: str) -> list[str]:
           """Adds any adaptations to the secret that should be masked."""
           if not self._mask_adapter:
               return [secret]
           # This can return an iterable of secrets to mask OR a single secret as a string
           secret_or_secrets = self._mask_adapter(secret)
   
           if not isinstance(secret_or_secrets, str):
               # if its not a string, it must be an iterable
               return secret_or_secrets
          else:
                return [secret]
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] csm10495 commented on pull request #28239: Add a new config for adapting masked secrets to make it easier to prevent secret leakage in logs

Posted by GitBox <gi...@apache.org>.
csm10495 commented on PR #28239:
URL: https://github.com/apache/airflow/pull/28239#issuecomment-1347421277

   @ashb check again. I think I've addressed your comments. I also cleaned up a tiny bit in the area


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] csm10495 commented on pull request #28239: Add a new config for adapting masked secrets to make it easier to prevent secret leakage in logs

Posted by GitBox <gi...@apache.org>.
csm10495 commented on PR #28239:
URL: https://github.com/apache/airflow/pull/28239#issuecomment-1349926073

   @ashb one more round. Thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ashb commented on a diff in pull request #28239: Add a new config for adapting masked secrets to make it easier to prevent secret leakage in logs

Posted by GitBox <gi...@apache.org>.
ashb commented on code in PR #28239:
URL: https://github.com/apache/airflow/pull/28239#discussion_r1047671128


##########
airflow/utils/log/secrets_masker.py:
##########
@@ -240,17 +241,53 @@ def redact(self, item: Redactable, name: str | None = None) -> Redacted:
         """
         return self._redact(item, name, depth=0)
 
-    def add_mask(self, secret: str | dict | Iterable, name: str | None = None):
-        """Add a new secret to be masked to this filter instance."""
+    @cached_property
+    def _mask_adapter(self) -> None | Callable:
+        """Pulls the secret mask adapter from config.
+
+        This lives in a function here to be cached and only hit the config once.
+        """
         from airflow.configuration import conf
 
-        test_mode: bool = conf.getboolean("core", "unit_test_mode")
+        return conf.getimport("logging", "secret_mask_adapter", fallback=None)
+
+    @cached_property
+    def _test_mode(self) -> bool:
+        """Pulls the unit test mode flag from config.
+
+        This lives in a function here to be cached and only hit the config once.
+        """
+        from airflow.configuration import conf
+
+        return conf.getboolean("core", "unit_test_mode")
+
+    def _add_adaptations(self, secret: str | dict | Iterable, name: str | None = None):
+        """Adds any adaptations to the secret that should be masked."""
+        if self._mask_adapter:
+            # This can return an iterable of secrets to mask OR a single secret as a string
+            secret_or_secrets = self._mask_adapter(secret)
+
+            if not isinstance(secret_or_secrets, str):
+                # if its not a string, it must be an iterable
+                for secret in secret_or_secrets:
+                    self.add_mask(secret, name, add_adaptations=False)
+            else:
+                self.add_mask(secret_or_secrets, name, add_adaptations=False)

Review Comment:
   ```suggestion
       def _adaptations(self, secret: str):
           """Adds any adaptations to the secret that should be masked."""
           if not self._mask_adapter:
               return [secret]
           # This can return an iterable of secrets to mask OR a single secret as a string
           secret_or_secrets = self._mask_adapter(secret)
   
           if not isinstance(secret_or_secrets, str):
               # if its not a string, it must be an iterable
               return secret_or_secrets
          else:
                return [secret]
   ```



##########
airflow/utils/log/secrets_masker.py:
##########
@@ -240,17 +241,53 @@ def redact(self, item: Redactable, name: str | None = None) -> Redacted:
         """
         return self._redact(item, name, depth=0)
 
-    def add_mask(self, secret: str | dict | Iterable, name: str | None = None):
-        """Add a new secret to be masked to this filter instance."""
+    @cached_property
+    def _mask_adapter(self) -> None | Callable:
+        """Pulls the secret mask adapter from config.
+
+        This lives in a function here to be cached and only hit the config once.
+        """
         from airflow.configuration import conf
 
-        test_mode: bool = conf.getboolean("core", "unit_test_mode")
+        return conf.getimport("logging", "secret_mask_adapter", fallback=None)
+
+    @cached_property
+    def _test_mode(self) -> bool:
+        """Pulls the unit test mode flag from config.
+
+        This lives in a function here to be cached and only hit the config once.
+        """
+        from airflow.configuration import conf
+
+        return conf.getboolean("core", "unit_test_mode")
+
+    def _add_adaptations(self, secret: str | dict | Iterable, name: str | None = None):
+        """Adds any adaptations to the secret that should be masked."""
+        if self._mask_adapter:
+            # This can return an iterable of secrets to mask OR a single secret as a string
+            secret_or_secrets = self._mask_adapter(secret)
+
+            if not isinstance(secret_or_secrets, str):
+                # if its not a string, it must be an iterable
+                for secret in secret_or_secrets:
+                    self.add_mask(secret, name, add_adaptations=False)
+            else:
+                self.add_mask(secret_or_secrets, name, add_adaptations=False)
+
+    def add_mask(self, secret: str | dict | Iterable, name: str | None = None, add_adaptations: bool = True):
+        """Add a new secret to be masked to this filter instance.
+
+        If add_adaptations is True, the secret mask adapter will be used to add adaptations for the secret
+        as well.
+        """
         if isinstance(secret, dict):
             for k, v in secret.items():
                 self.add_mask(v, k)
         elif isinstance(secret, str):
-            if not secret or (test_mode and secret in SECRETS_TO_SKIP_MASKING_FOR_TESTS):
+            if not secret or (self._test_mode and secret in SECRETS_TO_SKIP_MASKING_FOR_TESTS):
                 return
+            if add_adaptations:
+                self._add_adaptations(secret, name)
             pattern = re.escape(secret)
             if pattern not in self.patterns and (not name or should_hide_value_for_key(name)):
                 self.patterns.add(pattern)

Review Comment:
   This suggestion won't blindly apply, but we should remove the current `self.replacer = re.compile("|".join(self.patterns))` line
   
   ```suggestion
               new_mask = False
               for s in self._adaptation(secret):
                   pattern = re.escape(secret)
                   if pattern not in self.patterns and (not name or should_hide_value_for_key(name)):
                       self.patterns.add(pattern)
                       new_mask = True
           if new_mask:
               self.replacer = re.compile("|".join(self.patterns))                    
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ashb commented on a diff in pull request #28239: Add a new config for adapting masked secrets to make it easier to prevent secret leakage in logs

Posted by GitBox <gi...@apache.org>.
ashb commented on code in PR #28239:
URL: https://github.com/apache/airflow/pull/28239#discussion_r1047671128


##########
airflow/utils/log/secrets_masker.py:
##########
@@ -240,17 +241,53 @@ def redact(self, item: Redactable, name: str | None = None) -> Redacted:
         """
         return self._redact(item, name, depth=0)
 
-    def add_mask(self, secret: str | dict | Iterable, name: str | None = None):
-        """Add a new secret to be masked to this filter instance."""
+    @cached_property
+    def _mask_adapter(self) -> None | Callable:
+        """Pulls the secret mask adapter from config.
+
+        This lives in a function here to be cached and only hit the config once.
+        """
         from airflow.configuration import conf
 
-        test_mode: bool = conf.getboolean("core", "unit_test_mode")
+        return conf.getimport("logging", "secret_mask_adapter", fallback=None)
+
+    @cached_property
+    def _test_mode(self) -> bool:
+        """Pulls the unit test mode flag from config.
+
+        This lives in a function here to be cached and only hit the config once.
+        """
+        from airflow.configuration import conf
+
+        return conf.getboolean("core", "unit_test_mode")
+
+    def _add_adaptations(self, secret: str | dict | Iterable, name: str | None = None):
+        """Adds any adaptations to the secret that should be masked."""
+        if self._mask_adapter:
+            # This can return an iterable of secrets to mask OR a single secret as a string
+            secret_or_secrets = self._mask_adapter(secret)
+
+            if not isinstance(secret_or_secrets, str):
+                # if its not a string, it must be an iterable
+                for secret in secret_or_secrets:
+                    self.add_mask(secret, name, add_adaptations=False)
+            else:
+                self.add_mask(secret_or_secrets, name, add_adaptations=False)

Review Comment:
   ```suggestion
       def _adaptations(self, secret: str) -> list[str]:
           """Adds any adaptations to the secret that should be masked."""
           yield secret
           if not self._mask_adapter:
               return
           # This can return an iterable of secrets to mask OR a single secret as a string
           secret_or_secrets = self._mask_adapter(secret)
   
           if not isinstance(secret_or_secrets, str):
               # if its not a string, it must be an iterable
               yield from secret_or_secrets
          else:
                yield secret_or_secrets
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] csm10495 commented on a diff in pull request #28239: Add a new config for adapting masked secrets to make it easier to prevent secret leakage in logs

Posted by GitBox <gi...@apache.org>.
csm10495 commented on code in PR #28239:
URL: https://github.com/apache/airflow/pull/28239#discussion_r1047661033


##########
airflow/utils/log/secrets_masker.py:
##########
@@ -240,17 +241,53 @@ def redact(self, item: Redactable, name: str | None = None) -> Redacted:
         """
         return self._redact(item, name, depth=0)
 
-    def add_mask(self, secret: str | dict | Iterable, name: str | None = None):
-        """Add a new secret to be masked to this filter instance."""
+    @cached_property
+    def _mask_adapter(self) -> None | Callable:
+        """Pulls the secret mask adapter from config.
+
+        This lives in a function here to be cached and only hit the config once.
+        """
         from airflow.configuration import conf
 
-        test_mode: bool = conf.getboolean("core", "unit_test_mode")
+        return conf.getimport("logging", "secret_mask_adapter", fallback=None)
+
+    @cached_property
+    def _test_mode(self) -> bool:
+        """Pulls the unit test mode flag from config.
+
+        This lives in a function here to be cached and only hit the config once.
+        """
+        from airflow.configuration import conf
+
+        return conf.getboolean("core", "unit_test_mode")
+
+    def _add_adaptations(self, secret: str | dict | Iterable, name: str | None = None):
+        """Adds any adaptations to the secret that should be masked."""
+        if self._mask_adapter:
+            # This can return an iterable of secrets to mask OR a single secret as a string
+            secret_or_secrets = self._mask_adapter(secret)
+
+            if not isinstance(secret_or_secrets, str):
+                # if its not a string, it must be an iterable
+                for secret in secret_or_secrets:
+                    self.add_mask(secret, name, add_adaptations=False)
+            else:
+                self.add_mask(secret_or_secrets, name, add_adaptations=False)
+
+    def add_mask(self, secret: str | dict | Iterable, name: str | None = None, add_adaptations: bool = True):

Review Comment:
   It breaks the recursion problem. add_adaptations gets set to False when add_adaptations calls add_mask



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org