Posted to commits@airflow.apache.org by "jurovee (via GitHub)" <gi...@apache.org> on 2023/03/11 13:23:45 UTC

[GitHub] [airflow] jurovee opened a new issue, #30039: Sensitive variable not masked in task logs when named with _ENCODED suffix

jurovee opened a new issue, #30039:
URL: https://github.com/apache/airflow/issues/30039

   ### Apache Airflow version
   
   Other Airflow 2 version (please specify below)
   
   ### What happened
   
   **Airflow 2.4.3**
   
   Sensitive variables with a name like **ACCOUNT_PASSWORD_ENCODED** (for url-encoded versions of passwords) are not being masked properly in task logs or rendered templates.
   
   In our case, each of these variables has a counterpart named **ACCOUNT_PASSWORD**, and those are masked **without any issues**.
   
   `AIRFLOW__CORE__HIDE_SENSITIVE_VAR_CONN_FIELDS` is set to **True**, and I also tried adding custom fields such as "encoded", "password_encoded" or "PASSWORD_ENCODED" to `AIRFLOW__CORE__SENSITIVE_VAR_CONN_NAMES`, e.g.:
   
   `AIRFLOW__CORE__SENSITIVE_VAR_CONN_NAMES: "encoded,password_encoded"`
   
   Unfortunately, this had no impact on masking.
   
   I also ran `airflow.utils.log.secrets_masker.should_hide_value_for_key('ACCOUNT_PASSWORD_ENCODED')` from the Airflow container and it returns True, so I have no idea why the value is not being hidden.
   
   Could it be related to `%` characters in the variable value or something?
   
   ### What you think should happen instead
   
   Sensitive variables with a name like **ACCOUNT_PASSWORD_ENCODED** (for url-encoded versions of passwords) should be masked in Airflow logs or rendered templates as they contain a "magic" substring **PASSWORD**.
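
As a rough illustration of the name-based check described above, the sketch below flags a variable name when it contains one of a set of sensitive keywords, compared case-insensitively. The keyword set and the `looks_sensitive` helper are assumptions made for this example and are not Airflow's implementation; the real check is `airflow.utils.log.secrets_masker.should_hide_value_for_key`.

```python
# Hypothetical keyword list for illustration; Airflow's real defaults may differ.
SENSITIVE_KEYWORDS = frozenset({"password", "secret", "token", "api_key"})


def looks_sensitive(name: str, extra_keywords: frozenset = frozenset()) -> bool:
    """Return True if any sensitive keyword occurs as a substring of the name."""
    lowered = name.strip().lower()
    keywords = SENSITIVE_KEYWORDS | {k.lower() for k in extra_keywords}
    return any(keyword in lowered for keyword in keywords)


if __name__ == "__main__":
    for candidate in ("ACCOUNT_PASSWORD", "ACCOUNT_PASSWORD_ENCODED", "ACCOUNT_NAME"):
        print(candidate, looks_sensitive(candidate))
```

Under a substring check of this kind the `_ENCODED` suffix makes no difference, which is consistent with `should_hide_value_for_key` returning True in the report above.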
   
   ### How to reproduce
   
   Create a variable named **SOMETHING_PASSWORD_ENCODED** in your Airflow instance and use it in a task, e.g. a BashOperator command `echo {SOMETHING_PASSWORD_ENCODED}`. Then create a variable without the **_ENCODED** suffix and do the same. The first one is not masked; the second one is.
   
   ### Operating System
   
   K8S Debian 10 Linux Container
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Other 3rd-party Helm chart
   
   ### Deployment details
   
   _No response_
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] jurovee commented on issue #30039: Sensitive variable not masked in task logs when containing URL encoded string

Posted by "jurovee (via GitHub)" <gi...@apache.org>.
jurovee commented on issue #30039:
URL: https://github.com/apache/airflow/issues/30039#issuecomment-1465285604

   That works indeed. I am a bit confused, though: if the value is stored in a "sensitive" variable, shouldn't using that value in any form, e.g. printing it, get it masked either way?




[GitHub] [airflow] potiuk commented on issue #30039: Sensitive variable not masked in task logs when containing URL encoded string

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on issue #30039:
URL: https://github.com/apache/airflow/issues/30039#issuecomment-1465289065

   Closing it 




[GitHub] [airflow] jurovee commented on issue #30039: Sensitive variable not masked in task logs when containing URL encoded string

Posted by "jurovee (via GitHub)" <gi...@apache.org>.
jurovee commented on issue #30039:
URL: https://github.com/apache/airflow/issues/30039#issuecomment-1465291994

   Got it. I was under the false, noob impression that the Airflow webserver just somehow sees a string (in logs, for example) and automatically hides it if it is contained in a sensitive variable's value. It's a bit more complicated indeed ;) Thanks both for clarifying. Gonna update our codebase accordingly.




[GitHub] [airflow] jurovee commented on issue #30039: Sensitive variable not masked in task logs when containing URL encoded string

Posted by "jurovee (via GitHub)" <gi...@apache.org>.
jurovee commented on issue #30039:
URL: https://github.com/apache/airflow/issues/30039#issuecomment-1465282119

   @hussein-awala I just checked on 2.5.1:
   
   ```
   >>> from airflow.models import Variable
   >>> Variable.get('ABC_PASSWORD')
   'w7.%40jp%295%24KCEvrR~'
   ```
   
   BashOperator Task
   command: `echo 'hello, password is w7.%40jp%295%24KCEvrR~'`
   
   Airflow logs from the task:
   
   ```
   [2023-03-12, 20:36:12 CET] {subprocess.py:75} INFO - Running command: ['/bin/bash', '-c', "echo 'hello, password is w7.%40jp%295%24KCEvrR~'"]
   [2023-03-12, 20:36:12 CET] {subprocess.py:86} INFO - Output:
   [2023-03-12, 20:36:12 CET] {subprocess.py:93} INFO - hello, password is w7.%40jp%295%24KCEvrR~
   [2023-03-12, 20:36:12 CET] {subprocess.py:97} INFO - Command exited with return code 0
   ```
   




[GitHub] [airflow] jurovee commented on issue #30039: Sensitive variable not masked in task logs when named with _ENCODED suffix

Posted by "jurovee (via GitHub)" <gi...@apache.org>.
jurovee commented on issue #30039:
URL: https://github.com/apache/airflow/issues/30039#issuecomment-1464933892

   @hussein-awala Sure, I'll try to do that on Monday. Digging into it more, though, it seems to be related to the specific value of the variable, not really the name. Can you please also check with a value that is a URL-encoded string? E.g. the URL-encoded string `w7.%40jp%295%24KCEvrR~`, made from a random string I've just generated: `w7.@jp)5$KCEvrR~`. Thank you!
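
For reference, the encoded value above can be reproduced with Python's standard `urllib.parse.quote`. This is only a sketch of how the two strings relate; the thread does not say which tool was actually used to encode the value.

```python
from urllib.parse import quote, unquote

raw = "w7.@jp)5$KCEvrR~"

# safe="" percent-encodes every reserved character, including "/".
encoded = quote(raw, safe="")

print(encoded)           # w7.%40jp%295%24KCEvrR~
print(unquote(encoded))  # w7.@jp)5$KCEvrR~
```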




[GitHub] [airflow] potiuk commented on issue #30039: Sensitive variable not masked in task logs when containing URL encoded string

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on issue #30039:
URL: https://github.com/apache/airflow/issues/30039#issuecomment-1465288940

   No. How would you want to do it? You would have to not only return the value but also remember that it was retrieved from a sensitively named variable. Once you retrieve it, it loses the 'source association'.
   
   You would have to always pass the variable together with some metadata describing the provenance of the string, and your own code would have to check that metadata before printing.
   
   There is no 'transparent' way to handle this. The best we can do is handle the case where this is code that we can check with Jinja before it gets interpreted.
   
   It could probably be done using some super arcane methods with a lot of performance overhead, where you would store retrieved variables and metadata about them, but that would be terribly slow and complex, and it would likely not be possible to catch all usages of such a retrieved value.
   
   But if you would like to attempt such an exercise, feel free to open a PR :)
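
To make the point about lost 'source association' concrete, here is a toy value-based masker built on a simple in-memory registry. `SecretRegistry`, `register` and `redact` are hypothetical names for this sketch and are not Airflow's API. It can only hide strings that were explicitly registered in the same process; a plain Python string carries no record of where it came from, so a derived form (such as a URL-encoded copy) passes through unless it is registered too.

```python
from urllib.parse import quote


class SecretRegistry:
    """Toy masker: replaces registered secret values in text with '***'."""

    def __init__(self) -> None:
        self._values = set()

    def register(self, value: str) -> None:
        # Only values registered here, in this process, can ever be masked.
        if value:
            self._values.add(value)

    def redact(self, text: str) -> str:
        # Replace longer secrets first so overlapping values mask cleanly.
        for value in sorted(self._values, key=len, reverse=True):
            text = text.replace(value, "***")
        return text


registry = SecretRegistry()
password = "w7.@jp)5$KCEvrR~"
registry.register(password)

plain_line = f"hello, password is {password}"
encoded_line = f"hello, password is {quote(password, safe='')}"

print(registry.redact(plain_line))    # the registered value is masked
print(registry.redact(encoded_line))  # the encoded form was never registered
```

Registering the encoded form as well would mask the second line, but that has to be done by whoever creates the derived value; the registry cannot infer it.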




[GitHub] [airflow] hussein-awala commented on issue #30039: Sensitive variable not masked in task logs when named with _ENCODED suffix

Posted by "hussein-awala (via GitHub)" <gi...@apache.org>.
hussein-awala commented on issue #30039:
URL: https://github.com/apache/airflow/issues/30039#issuecomment-1464935343

   I just checked with these two values, and they are masked in the log.
   
   I'll let you test with Airflow 2.5.1; then please confirm that it works, or provide some new values to reproduce the issue.




[GitHub] [airflow] hussein-awala commented on issue #30039: Sensitive variable not masked in task logs when containing URL encoded string

Posted by "hussein-awala (via GitHub)" <gi...@apache.org>.
hussein-awala commented on issue #30039:
URL: https://github.com/apache/airflow/issues/30039#issuecomment-1465283924

   This is normal. When you load the variable via `Variable.get`, you get a plain Python string, so when you use it in the operator, Airflow treats it as a normal string.
   
   Could you try with Jinja templating?
   ```python
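   from airflow.operators.bash import BashOperator
   
   BashOperator(
       task_id="bash",
       # Single quotes instead of backticks: bash treats backticks as command
       # substitution and would try to run the rendered value as a command.
       bash_command="echo '{{ var.value.get('ABC_PASSWORD') }}'",
   )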
   ```




[GitHub] [airflow] hussein-awala commented on issue #30039: Sensitive variable not masked in task logs when named with _ENCODED suffix

Posted by "hussein-awala (via GitHub)" <gi...@apache.org>.
hussein-awala commented on issue #30039:
URL: https://github.com/apache/airflow/issues/30039#issuecomment-1464929321

   I cannot reproduce it with Airflow 2.5.1. Can you try upgrading to the latest version and check if it works?




[GitHub] [airflow] potiuk closed issue #30039: Sensitive variable not masked in task logs when containing URL encoded string

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk closed issue #30039: Sensitive variable not masked in task logs when containing URL encoded string
URL: https://github.com/apache/airflow/issues/30039

