Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/01/27 17:12:30 UTC

[GitHub] [airflow] josh-fell opened a new pull request #21162: Change logging level details of connection info in `get_connection()`

josh-fell opened a new pull request #21162:
URL: https://github.com/apache/airflow/pull/21162


   Related: #19883
   
   Currently, task logs can contain all of a connection's details, depending on how the connection associated with the task is configured (i.e. if `host` is a provided connection attr). These details are logged at the INFO level but seem more appropriate for debugging.
   
   This PR intends to clean up this connection logging a little. INFO-level logging will contain only the ID of the connection that is used, while the connection details move to the DEBUG level (and remain masked). Additionally, the connection ID is logged regardless of which connection attrs are provided (i.e. the `host` check is removed). Lastly, this change has the small added benefit that connection info users do not want in their logs is not exposed accidentally or unknowingly in the first place, rather than being exposed _first_ and then requiring configuration to mask it later (assuming the exposure is noticed at all).
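
The logging split described above can be illustrated with a minimal sketch that uses only Python's standard `logging` module. This is not Airflow's actual `get_connection()` implementation: the `Connection` dataclass, the `_CONNECTIONS` registry, the `masked_details()` helper, and the `captured_output()` demo function are hypothetical names invented for illustration, and the masking shown is a simple placeholder rather than Airflow's own secrets masking.

```python
import io
import logging
from dataclasses import dataclass
from typing import Dict, Optional

log = logging.getLogger("connection_logging_sketch")


@dataclass
class Connection:
    """Hypothetical stand-in for a connection object (not Airflow's class)."""

    conn_id: str
    host: Optional[str] = None
    port: Optional[int] = None
    login: Optional[str] = None
    password: Optional[str] = None

    def masked_details(self) -> str:
        # Placeholder masking: never emit the real password.
        password = "***" if self.password else None
        return (
            f"host={self.host}, port={self.port}, "
            f"login={self.login}, password={password}"
        )


# Hypothetical in-memory registry standing in for a real connection store.
_CONNECTIONS: Dict[str, Connection] = {
    "my_db": Connection("my_db", "db.example.com", 5432, "user", "s3cret"),
}


def get_connection(conn_id: str) -> Connection:
    conn = _CONNECTIONS[conn_id]
    # INFO: only the connection ID, logged whether or not `host` is set.
    log.info("Using connection ID '%s' for task execution.", conn_id)
    # DEBUG: the (still masked) connection details.
    log.debug("Connection details: %s", conn.masked_details())
    return conn


def captured_output(level: int, conn_id: str) -> str:
    """Return what the logger emits for one lookup at the given level."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    log.addHandler(handler)
    log.setLevel(level)
    try:
        get_connection(conn_id)
    finally:
        log.removeHandler(handler)
    return stream.getvalue()
```

Calling `captured_output(logging.INFO, "my_db")` yields only the line naming the connection ID, while `captured_output(logging.DEBUG, "my_db")` additionally yields the masked details, which mirrors the intent stated in the PR description.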
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #21162: Change logging level details of connection info in `get_connection()`

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #21162:
URL: https://github.com/apache/airflow/pull/21162#issuecomment-1023560760


   The PR most likely needs to run the full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly, please rebase it onto the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.





[GitHub] [airflow] potiuk commented on pull request #21162: Change logging level details of connection info in `get_connection()`

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #21162:
URL: https://github.com/apache/airflow/pull/21162#issuecomment-1026187017


   > Oh that makes perfect sense. Thanks for all the context @potiuk 🚀
   
   No problem. I am writing up "Architecture decision records" in https://github.com/apache/airflow/tree/main/dev/breeze/doc/adr during the Breeze2 rewrite project, so this is actually a good next one to add.
   
   








[GitHub] [airflow] potiuk edited a comment on pull request #21162: Change logging level details of connection info in `get_connection()`

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #21162:
URL: https://github.com/apache/airflow/pull/21162#issuecomment-1026049782


   > Is it expected that the `Providers` tests in the "Tests: Always API Core Other CLI Providers Integration" suite don't run for MySQL and MSSQL? I was surprised that only Postgres and SQLite failed.
   
   Yes. That was done as part of stabilizing our flaky CI Tests.
   
   Both MySQL and MSSQL (despite very aggressive optimisation of the configuration of their dockerized versions) require much more memory to run than Postgres and SQLite. That led to jobs failing quite often with `Exit code 137` (which means they ran out of memory) when they were run on Public Runners. That's why this type of test is now disabled for those two databases (but only on Public Runners).
   
   In fact, it is actually even printed there. If you unfold "Determine how to run the tests" in those tests you will see this:
   
   ![image](https://user-images.githubusercontent.com/595491/151846256-fff9c662-513b-4f6c-bc0f-34d51c5be215.png)
   
   You will find the logic controlling it here https://github.com/apache/airflow/blob/906d710060ebfe893ef3d7cddf00d2c49c7998fa/scripts/ci/testing/ci_run_airflow_testing.sh#L68
   
   Also, "all" tests will run in "main" after the change is merged. Those tests run on our self-hosted machines with 64 GB of memory, which have enough CPUs and memory to always run all the tests in parallel. So we will see a failing main if those "Providers" tests fail on MySQL or MSSQL (which is highly unlikely, because Provider tests are not supposed to be "Metadata-DB" dependent). Yet, "just in case", we always run all tests in main.
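
The selection behaviour described in this comment can be sketched roughly as follows. The real logic lives in the linked `ci_run_airflow_testing.sh` shell script and is not reproduced here; this Python version is only an illustration under stated assumptions. The function name `select_test_types`, the `self_hosted` flag, and the `MEMORY_HUNGRY_BACKENDS` set are hypothetical, and the list of test types is taken from the suite name quoted above.

```python
from typing import List

# Test types as they appear in the suite name quoted in the question.
ALL_TEST_TYPES: List[str] = [
    "Always", "API", "Core", "Other", "CLI", "Providers", "Integration",
]

# Assumption for this sketch: the backends reported as needing much more memory.
MEMORY_HUNGRY_BACKENDS = {"mysql", "mssql"}


def select_test_types(backend: str, self_hosted: bool) -> List[str]:
    """Pick which test types to run for a backend on a given runner kind."""
    # Self-hosted runners have enough memory, so everything runs there.
    if self_hosted:
        return list(ALL_TEST_TYPES)
    # On public runners, skip "Providers" for the memory-hungry backends.
    if backend.lower() in MEMORY_HUNGRY_BACKENDS:
        return [t for t in ALL_TEST_TYPES if t != "Providers"]
    return list(ALL_TEST_TYPES)
```

For example, `select_test_types("mysql", self_hosted=False)` omits `"Providers"`, while the same backend with `self_hosted=True` returns the full list, matching the statement that all tests still run in main.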
    





[GitHub] [airflow] potiuk commented on pull request #21162: Change logging level details of connection info in `get_connection()`

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #21162:
URL: https://github.com/apache/airflow/pull/21162#issuecomment-1040171208


   Just an old docker-compose problem that is already fixed. Merging.





[GitHub] [airflow] potiuk closed pull request #21162: Change logging level details of connection info in `get_connection()`

Posted by GitBox <gi...@apache.org>.
potiuk closed pull request #21162:
URL: https://github.com/apache/airflow/pull/21162


   





[GitHub] [airflow] josh-fell commented on pull request #21162: Change logging level details of connection info in `get_connection()`

Posted by GitBox <gi...@apache.org>.
josh-fell commented on pull request #21162:
URL: https://github.com/apache/airflow/pull/21162#issuecomment-1026183490


   Oh that makes perfect sense. Thanks for all the context @potiuk 🚀 








[GitHub] [airflow] potiuk merged pull request #21162: Change logging level details of connection info in `get_connection()`

Posted by GitBox <gi...@apache.org>.
potiuk merged pull request #21162:
URL: https://github.com/apache/airflow/pull/21162


   











[GitHub] [airflow] josh-fell commented on pull request #21162: Change logging level details of connection info in `get_connection()`

Posted by GitBox <gi...@apache.org>.
josh-fell commented on pull request #21162:
URL: https://github.com/apache/airflow/pull/21162#issuecomment-1025944035


   Is it expected that the `Providers` tests in the "Tests: Always API Core Other CLI Providers Integration" suite don't run for MySQL and MSSQL? I was surprised that only Postgres and SQLite failed.
   
   ![image](https://user-images.githubusercontent.com/48934154/151829183-d480490e-7c12-46ac-9669-c74226cb09fe.png)
   





[GitHub] [airflow] potiuk commented on pull request #21162: Change logging level details of connection info in `get_connection()`

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #21162:
URL: https://github.com/apache/airflow/pull/21162#issuecomment-1026051980


   The same tests on self-hosted runners look like that:
   
   ![image](https://user-images.githubusercontent.com/595491/151847350-b8dc4058-f82d-48d9-ab53-d266087a4890.png)
   

