Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/06/30 19:08:03 UTC

[GitHub] [airflow] Unit03 opened a new issue #9595: SparkSubmitOperator only masks one "form" of password arguments

Unit03 opened a new issue #9595:
URL: https://github.com/apache/airflow/issues/9595


   **Apache Airflow version**: 1.10.9, 1.10.10, trunk
   
   - **OS** (e.g. from /etc/os-release): Linux
   - **Others**: Bash/sh
   
   **What happened**:
   
   Password masking was added to `SparkSubmitOperator` (`SparkSubmitHook`, to be precise) in December 2019 (under [AIRFLOW-6350](https://issues.apache.org/jira/browse/AIRFLOW-6350); PR: #6917), but it only masks a password passed in the `--foo.password='value'` form; i.e. the value must be in single quotes and joined to the argument's name with an equals sign.
   
   **What you expected to happen**:
   
   I would expect this mechanism to also cover forms that (a) use double quotes or no quotes at all, or (b) use whitespace instead of an equals sign, e.g.:
   * `--foo.password=value`
   * `--foo.password="value"`
   * `--foo.password 'value'`
   * `--foo.password value`
   * `--foo.password "value"`
   
   But I may be missing something. Is there any reason [the initial version](https://github.com/apache/airflow/pull/6917) only covers the single-quoted-with-equal-sign form? The regular expression used in the masking code ([1.10.9 version](https://github.com/apache/airflow/blob/1.10.9/airflow/contrib/hooks/spark_submit_hook.py#L229-L236), [trunk version](https://github.com/apache/airflow/blob/master/airflow/providers/apache/spark/hooks/spark_submit.py#L236-L243)) looks pretty intentional:
   
   ```python
       def _mask_cmd(self, connection_cmd):
           # Mask any password related fields in application args with key value pair
           # where key contains password (case insensitive), e.g. HivePassword='abc'
   
           connection_cmd_masked = re.sub(
               r"(\S*?(?:secret|password)\S*?\s*=\s*')[^']*(?=')",
               r'\1******', ' '.join(connection_cmd), flags=re.I)
   ```
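
To make the behaviour described above easy to check, here is a minimal standalone sketch. It copies the pattern and replacement from the quoted snippet and applies them to the six argument forms discussed in this issue. The helper name `mask_cmd_sketch` and the `forms` list are hypothetical and exist only for this illustration; the quoted excerpt stops before any return statement, so the sketch simply returns the masked string.

```python
import re

# Same pattern as in the quoted _mask_cmd snippet.
MASK_PATTERN = r"(\S*?(?:secret|password)\S*?\s*=\s*')[^']*(?=')"


def mask_cmd_sketch(connection_cmd):
    """Hypothetical helper: join the arguments and apply the quoted regex."""
    return re.sub(MASK_PATTERN, r"\1******", " ".join(connection_cmd), flags=re.I)


# The single-quoted form plus the five other forms listed in this issue.
forms = [
    ["--foo.password='value'"],
    ["--foo.password=value"],
    ['--foo.password="value"'],
    ["--foo.password", "'value'"],
    ["--foo.password", "value"],
    ["--foo.password", '"value"'],
]

for args in forms:
    print(mask_cmd_sketch(args))
```

Only the first form comes back masked (`--foo.password='******'`); the other five are printed unchanged, because the pattern requires an equals sign followed by a single quote.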
   
   **How to reproduce it**:
   
   ```python
   from airflow import DAG
   from airflow.contrib.operators.spark_submit_operator import SparkSubmitOperator  # Airflow 1.10.9
   
   dag = DAG(...)
   SparkSubmitOperator(
       ...,
       conf={"spark.foo.password": "this_should_get_masked_but_it_doesnt"},
       dag=dag,
   )
   ```
   
   Running such a task will leak the password into Airflow logs.
   
   **Anything else we need to know**:
   
   Again, I may be missing something, e.g. something OS-specific. I'd be happy to learn something here. :)
   
   In case all/part of the other forms I mentioned should also get the masking treatment, [I have a change ready for opening a PR](https://github.com/Unit03/airflow/commits/mask-not-single-quoted-passwords).
   
   (Note there's no JIRA issue referenced in the commit messages: I cannot create issues in [Airflow's Jira](https://issues.apache.org/jira/projects/AIRFLOW/summary) for some reason)
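
As an illustration of what covering the other forms listed in this issue might involve, here is a rough sketch. It is not taken from the linked branch or from Airflow itself: the names `BROAD_PATTERN`, `_replace_value` and `mask_broad` are hypothetical, and the pattern is only one possible approach. It assumes that whatever follows a key containing "secret" or "password" is the value to hide, so it can over-mask (for example, the argument after a value-less flag such as `--use-password` would be hidden too).

```python
import re

# Hypothetical broader pattern (an assumption, not the project's code):
# a key containing "secret" or "password", then "=" or whitespace,
# then a single-quoted, double-quoted, or bare value.
BROAD_PATTERN = re.compile(
    r"(\S*?(?:secret|password)\S*?(?:\s*=\s*|\s+))"
    r"('[^']*'|\"[^\"]*\"|\S+)",
    flags=re.I,
)


def _replace_value(match):
    """Replace the value with asterisks, keeping matching surrounding quotes."""
    value = match.group(2)
    quoted = len(value) >= 2 and value[0] == value[-1] and value[0] in "'\""
    quote = value[0] if quoted else ""
    return f"{match.group(1)}{quote}******{quote}"


def mask_broad(connection_cmd):
    """Join the arguments and mask every value matched by BROAD_PATTERN."""
    return BROAD_PATTERN.sub(_replace_value, " ".join(connection_cmd))


print(mask_broad(["--conf", "spark.foo.password=this_should_get_masked_but_it_doesnt"]))
print(mask_broad(["--foo.password", '"value"']))
```

A callable replacement is used rather than a replacement string so that the quoting style of the original argument can be preserved in the masked output.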


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] boring-cyborg[bot] commented on issue #9595: SparkSubmitOperator only masks one "form" of password arguments

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #9595:
URL: https://github.com/apache/airflow/issues/9595#issuecomment-651987209


   Thanks for opening your first issue here! Be sure to follow the issue template!
   



[GitHub] [airflow] Unit03 commented on issue #9595: SparkSubmitOperator only masks one "form" of password arguments

Posted by GitBox <gi...@apache.org>.
Unit03 commented on issue #9595:
URL: https://github.com/apache/airflow/issues/9595#issuecomment-652601598


   All right, then, PR opened. :)



[GitHub] [airflow] tooptoop4 commented on issue #9595: SparkSubmitOperator only masks one "form" of password arguments

Posted by GitBox <gi...@apache.org>.
tooptoop4 commented on issue #9595:
URL: https://github.com/apache/airflow/issues/9595#issuecomment-657059406


   can this be closed?



[GitHub] [airflow] potiuk commented on issue #9595: SparkSubmitOperator only masks one "form" of password arguments

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #9595:
URL: https://github.com/apache/airflow/issues/9595#issuecomment-657099625


   Looks like it!



[GitHub] [airflow] potiuk commented on issue #9595: SparkSubmitOperator only masks one "form" of password arguments

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #9595:
URL: https://github.com/apache/airflow/issues/9595#issuecomment-657099776


   FYI @Unit03: you can put `Closes #ISSUE` in the commit message and it will close the related issue on merge :).



[GitHub] [airflow] potiuk closed issue #9595: SparkSubmitOperator only masks one "form" of password arguments

Posted by GitBox <gi...@apache.org>.
potiuk closed issue #9595:
URL: https://github.com/apache/airflow/issues/9595


   



[GitHub] [airflow] tooptoop4 commented on issue #9595: SparkSubmitOperator only masks one "form" of password arguments

Posted by GitBox <gi...@apache.org>.
tooptoop4 commented on issue #9595:
URL: https://github.com/apache/airflow/issues/9595#issuecomment-652078141


   @Unit03 please raise the PR. My original PR that you mentioned was very crude (as you noticed!), but better than nothing :) It was done that way because all my DAGs use that one specific format for sending conf.

