You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by "saulius (via GitHub)" <gi...@apache.org> on 2023/09/25 14:25:25 UTC

[GitHub] [airflow] saulius opened a new pull request, #34603: Allow passing context to DruidDbApiHook

saulius opened a new pull request, #34603:
URL: https://github.com/apache/airflow/pull/34603

   Druid's SQL API endpoint can accept context param to allow use of various query functionality [described in documentation](https://druid.apache.org/docs/latest/querying/sql-query-context/). This change enables passing context when using `DruidDbApiHook`. For our use case we also do Druid query post processing and values in context allow us to identify Druid query origin, in this case - `airflow`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] hussein-awala commented on a diff in pull request #34603: Allow passing context to DruidDbApiHook

Posted by "hussein-awala (via GitHub)" <gi...@apache.org>.
hussein-awala commented on code in PR #34603:
URL: https://github.com/apache/airflow/pull/34603#discussion_r1336375155


##########
tests/providers/apache/druid/hooks/test_druid.py:
##########
@@ -213,6 +214,36 @@ def test_get_auth_with_no_user_and_password(self, mock_get_connection):
         mock_get_connection.return_value = get_conn_value
         assert self.db_hook.get_auth() is None
 
+class TestDruidDbApiHookConn:
+    def setup_method(self):
+        self.connection = Connection(
+            host="test_host",
+            port=10000,
+            login="test_login",
+            password="test_password",
+            extra='{"endpoint": "/test/endpoint", "schema": "https"}'
+        )
+
+        class UnitTestDruidDbApiHook(DruidDbApiHook):
+            conn_name_attr = "druid_broker_conn_id"

Review Comment:
   What is the purpose of this subclass?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] saulius commented on a diff in pull request #34603: Allow passing context to DruidDbApiHook

Posted by "saulius (via GitHub)" <gi...@apache.org>.
saulius commented on code in PR #34603:
URL: https://github.com/apache/airflow/pull/34603#discussion_r1336868185


##########
airflow/providers/apache/druid/hooks/druid.py:
##########
@@ -164,6 +164,10 @@ class DruidDbApiHook(DbApiHook):
     hook_name = "Druid"
     supports_autocommit = False
 
+    def __init__(self, *args, **kwargs) -> None:
+        super().__init__(*args, **kwargs)
+        self.context: dict = kwargs.pop("context", {})

Review Comment:
   Makes sense, updated.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] saulius commented on a diff in pull request #34603: Allow passing context to DruidDbApiHook

Posted by "saulius (via GitHub)" <gi...@apache.org>.
saulius commented on code in PR #34603:
URL: https://github.com/apache/airflow/pull/34603#discussion_r1336887361


##########
tests/providers/apache/druid/hooks/test_druid.py:
##########
@@ -213,6 +214,36 @@ def test_get_auth_with_no_user_and_password(self, mock_get_connection):
         mock_get_connection.return_value = get_conn_value
         assert self.db_hook.get_auth() is None
 
+class TestDruidDbApiHookConn:
+    def setup_method(self):
+        self.connection = Connection(
+            host="test_host",
+            port=10000,
+            login="test_login",
+            password="test_password",
+            extra='{"endpoint": "/test/endpoint", "schema": "https"}'
+        )
+
+        class UnitTestDruidDbApiHook(DruidDbApiHook):
+            conn_name_attr = "druid_broker_conn_id"

Review Comment:
   Not necessary, see https://github.com/apache/airflow/pull/34603#discussion_r1336003279



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] Allow passing context to DruidDbApiHook [airflow]

Posted by "Taragolis (via GitHub)" <gi...@apache.org>.
Taragolis merged PR #34603:
URL: https://github.com/apache/airflow/pull/34603


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr commented on a diff in pull request #34603: Allow passing context to DruidDbApiHook

Posted by "uranusjr (via GitHub)" <gi...@apache.org>.
uranusjr commented on code in PR #34603:
URL: https://github.com/apache/airflow/pull/34603#discussion_r1336892230


##########
airflow/providers/apache/druid/hooks/druid.py:
##########
@@ -156,6 +156,11 @@ class DruidDbApiHook(DbApiHook):
 
     This hook is purely for users to query druid broker.
     For ingestion, please use druidHook.
+
+    :param context: dict: Optional query context parameters you might want to pass to
+        Druid's SQL endpoint.
+        See: https://druid.apache.org/docs/latest/querying/sql-query-context/
+        Example: {"sqlFinalizeOuterSketches": True}

Review Comment:
   ```suggestion
       :param context: Optional query context parameters to pass to the SQL endpoint.
           Example: ``{"sqlFinalizeOuterSketches": True}``.
           See: https://druid.apache.org/docs/latest/querying/sql-query-context/
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] Allow passing context to DruidDbApiHook [airflow]

Posted by "boring-cyborg[bot] (via GitHub)" <gi...@apache.org>.
boring-cyborg[bot] commented on PR #34603:
URL: https://github.com/apache/airflow/pull/34603#issuecomment-1745089246

   Awesome work, congrats on your first merged pull request! You are invited to check our [Issue Tracker](https://github.com/apache/airflow/issues) for additional contributions.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Taragolis commented on a diff in pull request #34603: Allow passing context to DruidDbApiHook

Posted by "Taragolis (via GitHub)" <gi...@apache.org>.
Taragolis commented on code in PR #34603:
URL: https://github.com/apache/airflow/pull/34603#discussion_r1336003279


##########
tests/providers/apache/druid/hooks/test_druid.py:
##########
@@ -213,6 +214,36 @@ def test_get_auth_with_no_user_and_password(self, mock_get_connection):
         mock_get_connection.return_value = get_conn_value
         assert self.db_hook.get_auth() is None
 
+class TestDruidDbApiHookConn:
+    def setup_method(self):
+        self.connection = Connection(
+            host="test_host",
+            port=10000,
+            login="test_login",
+            password="test_password",
+            extra='{"endpoint": "/test/endpoint", "schema": "https"}'
+        )
+
+        class UnitTestDruidDbApiHook(DruidDbApiHook):
+            conn_name_attr = "druid_broker_conn_id"
+
+        self.db_hook = UnitTestDruidDbApiHook(context={'query_origin': 'airflow'})
+        self.db_hook.get_connection = MagicMock()
+        self.db_hook.get_connection.return_value = self.connection

Review Comment:
   I'm not sure that we need to create new hook for testing purpose. 
   You could mock required methods in test cases methods + [`pytest.mark.parametrize`](https://docs.pytest.org/en/latest/how-to/parametrize.html#pytest-mark-parametrize-parametrizing-test-functions)



##########
airflow/providers/apache/druid/hooks/druid.py:
##########
@@ -164,6 +164,10 @@ class DruidDbApiHook(DbApiHook):
     hook_name = "Druid"
     supports_autocommit = False
 
+    def __init__(self, *args, **kwargs) -> None:
+        super().__init__(*args, **kwargs)
+        self.context: dict = kwargs.pop("context", {})

Review Comment:
   Better specify `context` it in `__init__` method and add it into the docstring.
   This gives more information for endusers about druid context and what the different with airflow context



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr commented on a diff in pull request #34603: Allow passing context to DruidDbApiHook

Posted by "uranusjr (via GitHub)" <gi...@apache.org>.
uranusjr commented on code in PR #34603:
URL: https://github.com/apache/airflow/pull/34603#discussion_r1336892230


##########
airflow/providers/apache/druid/hooks/druid.py:
##########
@@ -156,6 +156,11 @@ class DruidDbApiHook(DbApiHook):
 
     This hook is purely for users to query druid broker.
     For ingestion, please use druidHook.
+
+    :param context: dict: Optional query context parameters you might want to pass to
+        Druid's SQL endpoint.
+        See: https://druid.apache.org/docs/latest/querying/sql-query-context/
+        Example: {"sqlFinalizeOuterSketches": True}

Review Comment:
   ```suggestion
       :param context: Optional query context parameters to pass to the SQL endpoint.
           See: https://druid.apache.org/docs/latest/querying/sql-query-context/
           Example: ``{"sqlFinalizeOuterSketches": True}``
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] saulius commented on a diff in pull request #34603: Allow passing context to DruidDbApiHook

Posted by "saulius (via GitHub)" <gi...@apache.org>.
saulius commented on code in PR #34603:
URL: https://github.com/apache/airflow/pull/34603#discussion_r1336886823


##########
tests/providers/apache/druid/hooks/test_druid.py:
##########
@@ -213,6 +214,36 @@ def test_get_auth_with_no_user_and_password(self, mock_get_connection):
         mock_get_connection.return_value = get_conn_value
         assert self.db_hook.get_auth() is None
 
+class TestDruidDbApiHookConn:
+    def setup_method(self):
+        self.connection = Connection(
+            host="test_host",
+            port=10000,
+            login="test_login",
+            password="test_password",
+            extra='{"endpoint": "/test/endpoint", "schema": "https"}'
+        )
+
+        class UnitTestDruidDbApiHook(DruidDbApiHook):
+            conn_name_attr = "druid_broker_conn_id"
+
+        self.db_hook = UnitTestDruidDbApiHook(context={'query_origin': 'airflow'})
+        self.db_hook.get_connection = MagicMock()
+        self.db_hook.get_connection.return_value = self.connection

Review Comment:
   You're right, I refactored the test case as described.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr commented on a diff in pull request #34603: Allow passing context to DruidDbApiHook

Posted by "uranusjr (via GitHub)" <gi...@apache.org>.
uranusjr commented on code in PR #34603:
URL: https://github.com/apache/airflow/pull/34603#discussion_r1336891418


##########
airflow/providers/apache/druid/hooks/druid.py:
##########
@@ -156,6 +156,11 @@ class DruidDbApiHook(DbApiHook):
 
     This hook is purely for users to query druid broker.
     For ingestion, please use druidHook.
+
+    :param context: dict: Optional query context parameters you might want to pass to

Review Comment:
   ```suggestion
       :param context: Optional query context parameters you want to pass to
   ```



##########
airflow/providers/apache/druid/hooks/druid.py:
##########
@@ -156,6 +156,11 @@ class DruidDbApiHook(DbApiHook):
 
     This hook is purely for users to query druid broker.
     For ingestion, please use druidHook.
+
+    :param context: dict: Optional query context parameters you might want to pass to

Review Comment:
   ```suggestion
       :param context: Optional query context parameters you want to pass to
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] boring-cyborg[bot] commented on pull request #34603: Allow passing context to DruidDbApiHook

Posted by "boring-cyborg[bot] (via GitHub)" <gi...@apache.org>.
boring-cyborg[bot] commented on PR #34603:
URL: https://github.com/apache/airflow/pull/34603#issuecomment-1733820598

   Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
   Here are some useful points:
   - Pay attention to the quality of your code (ruff, mypy and type annotations). Our [pre-commits]( https://github.com/apache/airflow/blob/main/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that.
   - In case of a new feature add useful documentation (in docstrings or in `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/main/docs/apache-airflow/howto/custom-operator.rst) Consider adding an example DAG that shows how users should use it.
   - Consider using [Breeze environment](https://github.com/apache/airflow/blob/main/BREEZE.rst) for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
   - Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
   - Please follow [ASF Code of Conduct](https://www.apache.org/foundation/policies/conduct) for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
   - Be sure to read the [Airflow Coding style]( https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#coding-style-and-best-practices).
   - Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits.
   Apache Airflow is a community-driven project and together we are making it better 🚀.
   In case of doubts contact the developers at:
   Mailing List: dev@airflow.apache.org
   Slack: https://s.apache.org/airflow-slack
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org