You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/08/12 23:53:55 UTC

[GitHub] [airflow] dinigo opened a new pull request #17592: Support for passing arguments to SqlSensor underlying hooks

dinigo opened a new pull request #17592:
URL: https://github.com/apache/airflow/pull/17592


   closes: #13750
   related: #17315
   
   `SqlSensor` relies on underlying hooks that are automatically injected (DI) depending on the connection type provided to the `SqlSensor`. However, from time to time (but mainly in `BigqueryHook`) it is needed to pre-configure the "to-be-injected" hook because some default params do not fit the current config.
   
   For example, if using SqlSensor with Bigquery it by default configures the SQL dialect to "legacy SQL", and there is no way to change this through configuration. Instead the SqlSensor has to be extended and some functions overwritten (as mentioned in the #13750).
   
   To be able to customise the underlying hook there are 2 approaches to fix this:
   * Bring your own hook: instantiate a hook and provide it to the sensor already configured so it just checks that it inherits from `DbApiHook` but doesn't try to instantiate on its own.
   * Pass over the params you want SqlSensor to use to instantiate the hook.
   
   Either one has it's drawbacks:
   * Cannot be used if DAG is defined with yaml/json/config-file, since to instantiate a Hook requires "intelligence", relying solely in config seems more maintainable in the future.
   * The "tell us what's wrong, we got you covered" does not follow very well the "[open but closed](https://en.wikipedia.org/wiki/Open%E2%80%93closed_principle)" principle.
   
   Knowing that, and that @uranusjr approves this second approach, and also given the known limitations, this PR makes the SqlSensor pass over the hook_kwargs to the connection hook locator that returns an instantiated hook.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dinigo commented on pull request #17592: Support for passing arguments to SqlSensor underlying hooks

Posted by GitBox <gi...@apache.org>.
dinigo commented on pull request #17592:
URL: https://github.com/apache/airflow/pull/17592#issuecomment-898626742


   Looks like some tests are broken. Could anyone give a little context on the tests?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kazanzhy commented on pull request #17592: Support for passing arguments to SqlSensor underlying hooks

Posted by GitBox <gi...@apache.org>.
kazanzhy commented on pull request #17592:
URL: https://github.com/apache/airflow/pull/17592#issuecomment-923382781


   Is this PR in progress? 
   I've also thought to make the same one to use BigQuery StandartSQL in SqlSensor.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kazanzhy commented on pull request #17592: Support for passing arguments to SqlSensor underlying hooks

Posted by GitBox <gi...@apache.org>.
kazanzhy commented on pull request #17592:
URL: https://github.com/apache/airflow/pull/17592#issuecomment-962288607


   Hi @SamWheating.
   Could you give me advice on what I have to do to move this PR?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] eladkal closed pull request #17592: Support for passing arguments to SqlSensor underlying hooks

Posted by GitBox <gi...@apache.org>.
eladkal closed pull request #17592:
URL: https://github.com/apache/airflow/pull/17592


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] SamWheating commented on a change in pull request #17592: Support for passing arguments to SqlSensor underlying hooks

Posted by GitBox <gi...@apache.org>.
SamWheating commented on a change in pull request #17592:
URL: https://github.com/apache/airflow/pull/17592#discussion_r712555765



##########
File path: airflow/sensors/sql.py
##########
@@ -58,17 +63,27 @@ class SqlSensor(BaseSensorOperator):
     ui_color = '#7c7287'
 
     def __init__(
-        self, *, conn_id, sql, parameters=None, success=None, failure=None, fail_on_empty=False, **kwargs
+        self,
+        *,
+        conn_id,
+        sql,
+        parameters=None,
+        success=None,
+        failure=None,
+        fail_on_empty=False,
+        hook_kwargs=None,

Review comment:
       ```suggestion
           hook_kwargs={},
   ```
   
   And then the line below could be reduced to 
   
   ```python
   	self.hook_kwargs = hook_kwargs
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kazanzhy commented on a change in pull request #17592: Support for passing arguments to SqlSensor underlying hooks

Posted by GitBox <gi...@apache.org>.
kazanzhy commented on a change in pull request #17592:
URL: https://github.com/apache/airflow/pull/17592#discussion_r712566434



##########
File path: airflow/sensors/sql.py
##########
@@ -58,17 +63,27 @@ class SqlSensor(BaseSensorOperator):
     ui_color = '#7c7287'
 
     def __init__(
-        self, *, conn_id, sql, parameters=None, success=None, failure=None, fail_on_empty=False, **kwargs
+        self,
+        *,
+        conn_id,
+        sql,
+        parameters=None,
+        success=None,
+        failure=None,
+        fail_on_empty=False,
+        hook_kwargs=None,

Review comment:
       Could you take a look https://github.com/apache/airflow/pull/18394 please?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dinigo commented on pull request #17592: Support for passing arguments to SqlSensor underlying hooks

Posted by GitBox <gi...@apache.org>.
dinigo commented on pull request #17592:
URL: https://github.com/apache/airflow/pull/17592#issuecomment-900431389


   Done rebasing


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kazanzhy commented on a change in pull request #17592: Support for passing arguments to SqlSensor underlying hooks

Posted by GitBox <gi...@apache.org>.
kazanzhy commented on a change in pull request #17592:
URL: https://github.com/apache/airflow/pull/17592#discussion_r712553155



##########
File path: tests/sensors/test_sql_sensor.py
##########
@@ -242,6 +242,22 @@ def test_sql_sensor_postgres_poke_invalid_success(self, mock_hook):
             op.poke(None)
         assert "self.success is present, but not callable -> [1]" == str(ctx.value)
 
+    @mock.patch('airflow.sensors.sql.BaseHook')
+    def test_sql_sensor_bigquery_hook_kwargs(self, mock_hook):
+        op = SqlSensor(
+            task_id='sql_sensor_check',
+            conn_id='postgres_default',
+            sql="SELECT 1",
+            hook_kwargs={
+                'use_legacy_sql': False,
+                'location': 'test_location',
+            },
+        )
+
+        mock_hook.get_connection('google_cloud_default').conn_type = "google_cloud_platform"
+        assert op._get_hook().use_legacy_sql
+        assert op._get_hook().location == 'test_location'

Review comment:
       I think there may be the AssertionError due to `op._get_hook().use_legacy_sql` is False.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #17592: Support for passing arguments to SqlSensor underlying hooks

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #17592:
URL: https://github.com/apache/airflow/pull/17592#issuecomment-898651833


   Please rebase to latest `main`
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dinigo commented on a change in pull request #17592: Support for passing arguments to SqlSensor underlying hooks

Posted by GitBox <gi...@apache.org>.
dinigo commented on a change in pull request #17592:
URL: https://github.com/apache/airflow/pull/17592#discussion_r692068934



##########
File path: airflow/sensors/sql.py
##########
@@ -90,7 +105,7 @@ def _get_hook(self):
                 f"Connection type ({conn.conn_type}) is not supported by SqlSensor. "
                 + f"Supported connection types: {list(allowed_conn_type)}"
             )
-        return conn.get_hook()
+        return conn.get_hook(**self.hook_kwargs)

Review comment:
       I changed my mind about who should receive the destructured dict. I suppose you are right. because then we don't risk passing an argument to `get_hook` that is later popped out of the args, or unintendedly defined and taken out of the `get_hook` kwargs. I'll switch it the other way around. Just like discussed in the issue




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on a change in pull request #17592: Support for passing arguments to SqlSensor underlying hooks

Posted by GitBox <gi...@apache.org>.
uranusjr commented on a change in pull request #17592:
URL: https://github.com/apache/airflow/pull/17592#discussion_r690955205



##########
File path: airflow/models/connection.py
##########
@@ -286,8 +286,11 @@ def rotate_fernet_key(self):
         if self._extra and self.is_extra_encrypted:
             self._extra = fernet.rotate(self._extra.encode('utf-8')).decode()
 
-    def get_hook(self):
-        """Return hook based on conn_type."""
+    def get_hook(self, hook_kwargs: Dict):

Review comment:
       Should this not be `**hook_kwargs`

##########
File path: airflow/sensors/sql.py
##########
@@ -90,7 +105,7 @@ def _get_hook(self):
                 f"Connection type ({conn.conn_type}) is not supported by SqlSensor. "
                 + f"Supported connection types: {list(allowed_conn_type)}"
             )
-        return conn.get_hook()
+        return conn.get_hook(**self.hook_kwargs)

Review comment:
       … or this should be `conn.get_hook(self.hook_kwargs)`?

##########
File path: tests/sensors/test_sql_sensor.py
##########
@@ -242,6 +242,22 @@ def test_sql_sensor_postgres_poke_invalid_success(self, mock_hook):
             op.poke(None)
         assert "self.success is present, but not callable -> [1]" == str(ctx.value)
 
+    @mock.patch('airflow.sensors.sql.BaseHook')
+    def test_sql_sensor_bigquery_hook_kwargs(self, mock_hook):
+        op = SqlSensor(
+            task_id='sql_sensor_check',
+            conn_id='postgres_default',
+            sql="SELECT 1",
+            hook_kwargs={
+                'use_legacy_sql': False,
+                'location': 'test_location',
+            },
+        )
+
+        mock_hook.get_connection('google_cloud_default').conn_type = "google_cloud_platform"
+        assert op._get_hook().use_legacy_sql
+        assert op._get_hook().location == 'test_location'

Review comment:
       And this test should ensure the kwargs are correctly passed onto the hook.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on a change in pull request #17592: Support for passing arguments to SqlSensor underlying hooks

Posted by GitBox <gi...@apache.org>.
uranusjr commented on a change in pull request #17592:
URL: https://github.com/apache/airflow/pull/17592#discussion_r690955205



##########
File path: airflow/models/connection.py
##########
@@ -286,8 +286,11 @@ def rotate_fernet_key(self):
         if self._extra and self.is_extra_encrypted:
             self._extra = fernet.rotate(self._extra.encode('utf-8')).decode()
 
-    def get_hook(self):
-        """Return hook based on conn_type."""
+    def get_hook(self, hook_kwargs: Dict):

Review comment:
       Should this not be `**hook_kwargs`

##########
File path: airflow/sensors/sql.py
##########
@@ -90,7 +105,7 @@ def _get_hook(self):
                 f"Connection type ({conn.conn_type}) is not supported by SqlSensor. "
                 + f"Supported connection types: {list(allowed_conn_type)}"
             )
-        return conn.get_hook()
+        return conn.get_hook(**self.hook_kwargs)

Review comment:
       … or this should be `conn.get_hook(self.hook_kwargs)`?

##########
File path: tests/sensors/test_sql_sensor.py
##########
@@ -242,6 +242,22 @@ def test_sql_sensor_postgres_poke_invalid_success(self, mock_hook):
             op.poke(None)
         assert "self.success is present, but not callable -> [1]" == str(ctx.value)
 
+    @mock.patch('airflow.sensors.sql.BaseHook')
+    def test_sql_sensor_bigquery_hook_kwargs(self, mock_hook):
+        op = SqlSensor(
+            task_id='sql_sensor_check',
+            conn_id='postgres_default',
+            sql="SELECT 1",
+            hook_kwargs={
+                'use_legacy_sql': False,
+                'location': 'test_location',
+            },
+        )
+
+        mock_hook.get_connection('google_cloud_default').conn_type = "google_cloud_platform"
+        assert op._get_hook().use_legacy_sql
+        assert op._get_hook().location == 'test_location'

Review comment:
       And this test should ensure the kwargs are correctly passed onto the hook.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kazanzhy commented on a change in pull request #17592: Support for passing arguments to SqlSensor underlying hooks

Posted by GitBox <gi...@apache.org>.
kazanzhy commented on a change in pull request #17592:
URL: https://github.com/apache/airflow/pull/17592#discussion_r712553155



##########
File path: tests/sensors/test_sql_sensor.py
##########
@@ -242,6 +242,22 @@ def test_sql_sensor_postgres_poke_invalid_success(self, mock_hook):
             op.poke(None)
         assert "self.success is present, but not callable -> [1]" == str(ctx.value)
 
+    @mock.patch('airflow.sensors.sql.BaseHook')
+    def test_sql_sensor_bigquery_hook_kwargs(self, mock_hook):
+        op = SqlSensor(
+            task_id='sql_sensor_check',
+            conn_id='postgres_default',
+            sql="SELECT 1",
+            hook_kwargs={
+                'use_legacy_sql': False,
+                'location': 'test_location',
+            },
+        )
+
+        mock_hook.get_connection('google_cloud_default').conn_type = "google_cloud_platform"
+        assert op._get_hook().use_legacy_sql
+        assert op._get_hook().location == 'test_location'

Review comment:
       I think there may be the AssertionError due to `op._get_hook().use_legacy_sql` is False.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] eladkal commented on pull request #17592: Support for passing arguments to SqlSensor underlying hooks

Posted by GitBox <gi...@apache.org>.
eladkal commented on pull request #17592:
URL: https://github.com/apache/airflow/pull/17592#issuecomment-980077828


   Suppressed by https://github.com/apache/airflow/pull/18431


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dinigo commented on pull request #17592: Support for passing arguments to SqlSensor underlying hooks

Posted by GitBox <gi...@apache.org>.
dinigo commented on pull request #17592:
URL: https://github.com/apache/airflow/pull/17592#issuecomment-900431389


   Done rebasing


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dinigo commented on pull request #17592: Support for passing arguments to SqlSensor underlying hooks

Posted by GitBox <gi...@apache.org>.
dinigo commented on pull request #17592:
URL: https://github.com/apache/airflow/pull/17592#issuecomment-898049600


   `mypy` pre-commit hook is having some issue when trying to pull airflow image. I don't know why because breeze works fine. I just disabled it for this commit to be able to pre-provide something readable.
   ```
   docker: read tcp 192.168.65.3:59194->192.168.65.1:3128: read: connection reset by peer.
   See 'docker run --help'.
   ```
   Any help fixing the pre-commit will be much appreciated.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #17592: Support for passing arguments to SqlSensor underlying hooks

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #17592:
URL: https://github.com/apache/airflow/pull/17592#issuecomment-962287147


   This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dinigo commented on a change in pull request #17592: Support for passing arguments to SqlSensor underlying hooks

Posted by GitBox <gi...@apache.org>.
dinigo commented on a change in pull request #17592:
URL: https://github.com/apache/airflow/pull/17592#discussion_r692071092



##########
File path: tests/sensors/test_sql_sensor.py
##########
@@ -242,6 +242,22 @@ def test_sql_sensor_postgres_poke_invalid_success(self, mock_hook):
             op.poke(None)
         assert "self.success is present, but not callable -> [1]" == str(ctx.value)
 
+    @mock.patch('airflow.sensors.sql.BaseHook')
+    def test_sql_sensor_bigquery_hook_kwargs(self, mock_hook):
+        op = SqlSensor(
+            task_id='sql_sensor_check',
+            conn_id='postgres_default',
+            sql="SELECT 1",
+            hook_kwargs={
+                'use_legacy_sql': False,
+                'location': 'test_location',
+            },
+        )
+
+        mock_hook.get_connection('google_cloud_default').conn_type = "google_cloud_platform"
+        assert op._get_hook().use_legacy_sql
+        assert op._get_hook().location == 'test_location'

Review comment:
       You mean mocking also the BaseHook and checking that call arguments match the `** hook_kwargs`? you are right




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org