You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2019/10/22 19:52:00 UTC

[jira] [Commented] (AIRFLOW-5720) don't call _get_connections_from_db in TestCloudSqlDatabaseHook

    [ https://issues.apache.org/jira/browse/AIRFLOW-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16957306#comment-16957306 ] 

ASF GitHub Bot commented on AIRFLOW-5720:
-----------------------------------------

dstandish commented on pull request #6390: [AIRFLOW-5720] don't call private method _get_connections_from_db
URL: https://github.com/apache/airflow/pull/6390
 
 
   * tests are mocking lower-level than they need to
       - Tests were mocking airflow.hook.BaseHook.get_connections.
       - Instead they can mock airflow.gcp.hooks.cloud_sql.CloudSqlDatabaseHook.get_connection which is more direct.
   * should not reference private method
       - This is an impediment to refactoring of connections / creds.
   * Tests had complexity that did not add a benefit
       - only one of the three connections in _setup_connections actually had any impact on the test
       - supplying a param for default_gcp_project_id is misleading because it has no impact on the test
   
   See Jira ticket [AIRFLOW-5720](https://issues.apache.org/jira/browse/AIRFLOW-5720) for more info.
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [AIRFLOW-5720](https://issues.apache.org/jira/browse/AIRFLOW-5720) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
     - https://issues.apache.org/jira/browse/AIRFLOW-5720
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI changes:
   
   Issues with this test class:
   
   **tests are mocking lower-level than they need to**
   Tests were mocking airflow.hook.BaseHook.get_connections.
   Instead they can mock airflow.gcp.hooks.cloud_sql.CloudSqlDatabaseHook.get_connection which is more direct.
   
   **should not reference private method**
   
   This is an impediment to refactoring of connections / creds.
   
   **Tests had complexity that did not add a benefit**
   
   They all had this bit:
   
           self._setup_connections(get_connections, uri)
           gcp_conn_id = 'google_cloud_default'
           hook = CloudSqlDatabaseHook(
               default_gcp_project_id=BaseHook.get_connection(gcp_conn_id).extra_dejson.get(
                   'extra__google_cloud_platform__project')
           )
   _setup_connections was like this:
   
       @staticmethod
       def _setup_connections(get_connections, uri):
           gcp_connection = mock.MagicMock()
           gcp_connection.extra_dejson = mock.MagicMock()
           gcp_connection.extra_dejson.get.return_value = 'empty_project'
           cloudsql_connection = Connection()
           cloudsql_connection.parse_from_uri(uri)
           cloudsql_connection2 = Connection()
           cloudsql_connection2.parse_from_uri(uri)
           get_connections.side_effect = [[gcp_connection], [cloudsql_connection],
                                          [cloudsql_connection2]]
   Issues here are as follows.
   
   1. no test ever used the third side effect
   
   2. the first side effect does not help us; `default_gcp_project_id` is irrelevant
   
   Only one of the three connections in `_setup_connections` has any impact on the test.
   
   The call of `BaseHook.get_connection` only serves to discard the first connection in mock side effect list, `gcp_connection`.
   
   The second connection is the one that matters, and it is returned when `CloudSqlDatabaseHook` calls `self.get_connection` during init. Since it is a mock side effect, it doesn't matter what value is given for conn_id. So the `CloudSqlDatabaseHook` init param `default_gcp_project_id` has no consequence. And because it has no consequence, we should not supply a value for it because this is misleading.
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
     1. Subject is separated from body by a blank line
     1. Subject is limited to 50 characters (not including Jira issue reference)
     1. Subject does not end with a period
     1. Subject uses the imperative mood ("add", not "adding")
     1. Body wraps at 72 characters
     1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes how to use it.
     - All the public functions and the classes in the PR contain docstrings that explain what it does
     - If you implement backwards incompatible changes, please leave a note in the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so we can assign it to a appropriate release
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> don't call _get_connections_from_db in TestCloudSqlDatabaseHook
> ---------------------------------------------------------------
>
>                 Key: AIRFLOW-5720
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-5720
>             Project: Apache Airflow
>          Issue Type: New Feature
>          Components: gcp
>    Affects Versions: 1.10.5
>            Reporter: Daniel Standish
>            Assignee: Daniel Standish
>            Priority: Major
>
> Issues with this test class:
> *tests are mocking lower-level than they need to*
> Tests were mocking {{airflow.hook.BaseHook.get_connections}}.
> Instead they can mock {{airflow.gcp.hooks.cloud_sql.CloudSqlDatabaseHook.get_connection}} which is more direct.
> *should not reference private method*
> This is an impediment to refactoring of connections / creds.
> *Tests had complexity that did not add a benefit*
> They all had this bit:
> {code:python}
>         self._setup_connections(get_connections, uri)
>         gcp_conn_id = 'google_cloud_default'
>         hook = CloudSqlDatabaseHook(
>             default_gcp_project_id=BaseHook.get_connection(gcp_conn_id).extra_dejson.get(
>                 'extra__google_cloud_platform__project')
>         )
> {code}
> {{_setup_connections}} was like this:
> {code:python}
>     @staticmethod
>     def _setup_connections(get_connections, uri):
>         gcp_connection = mock.MagicMock()
>         gcp_connection.extra_dejson = mock.MagicMock()
>         gcp_connection.extra_dejson.get.return_value = 'empty_project'
>         cloudsql_connection = Connection()
>         cloudsql_connection.parse_from_uri(uri)
>         cloudsql_connection2 = Connection()
>         cloudsql_connection2.parse_from_uri(uri)
>         get_connections.side_effect = [[gcp_connection], [cloudsql_connection],
>                                        [cloudsql_connection2]]
> {code}
> Issues here are as follows.
> 1. no test ever used the third side effect
> 2. the first side effect does not help us; {{default_gcp_project_id}} is irrelevant
> Only one of the three connections in {{_setup_connections}} has any impact on the test.
> The call of {{BaseHook.get_connection}} only serves to discard the first connection in mock side effect list, {{gcp_connection}}.  
> The second connection is the one that matters, and it is returned when {{CloudSqlDatabaseHook}} calls `self.get_connection` during init.  Since it is a mock side effect, it doesn't matter what value is given for conn_id.  So the {{CloudSqlDatabaseHook}} init param {{default_gcp_project_id}} has no consequence.  And because it has no consequence, we should not supply a value  for it because this is misleading.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)